THE  DISCRIMINATION  OF  AUDITORY  TEMPORAL  PATTERNS:  EFFECT  OF 
TEMPORAL  POSITION,  CHANNEL,  AND  INTERTONE  TIME  STATISTICS 


By 


TOKTAM  SADRALODABAI 


A DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 
OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 
OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 
DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 


1996 


ACKNOWLEDGMENTS 


I would  like  to  thank  my  family,  especially  my  mother  and  my  sister,  from  the  bottom 
of  my  heart  for  their  continuous  support  and  encouragement:  my  mother,  for  being  a great 
role  model  throughout  my  life  as  well  as  providing  me  with  her  continuous  support  in  every 
possible  way;  my  sister,  whose  love  and  support  every  step  of  the  way  gave  me  the  strength 
and  encouragement  to  keep  going. 

Very special thanks also go to my friends, who put up with me when I was
unbearable. I especially would like to acknowledge and thank my dear friend, Dr. Kourosh
Saberi, for his friendship and endless support as well as his valuable time and assistance
whenever I asked for or needed it. I truly appreciate it. I thank Dr. Dee Montgomery; her
friendship and willingness to always listen and offer help have meant the world to me. I also
thank Chris Hays, my buddy, whose jokes, pranks and friendship kept me grounded during
the time when I truly needed it, and Josh Fryman, who has been a true friend and will always
be in that category. I thank Jennifer Nye; her friendship during the past year has been a
breath of fresh air. Last but not least, I would like to thank Sean Collins, whose love and
companionship have been a source of support and comfort during the most trying moments.

My special thanks go to my advisor, Dr. Robert Sorkin, for his continuous guidance
and patience with my dissertation; I do appreciate all his contributions. Finally, I would like
to gratefully acknowledge my committee members, Dr. Jeff Farrar, Dr. David Green, Dr.
John Middlebrooks, and Dr. David Perrott, for their valuable time with my study.

This  research  was  supported  by  grants  from  the  Air  Force  Office  of  Scientific 
Research. 




TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF FIGURES

ABSTRACT

INTRODUCTION
    Pattern Recognition
    Coding of Information in the Pattern of Auditory Stimuli
    Related Physiological Reports
    Current Experiments

THE MODEL FOR WEIGHT ESTIMATION

DENSITY FUNCTIONS FOR TEMPORAL PATTERN DISCRIMINATION

EXPERIMENT I
    Method
    Results

EXPERIMENT II
    Method
    Results

EXPERIMENT III
    Method
    Results

GENERAL DISCUSSION
    Binaural Pattern Discrimination and Some Implications
    Ideal Distribution of Weights in a Pattern Discrimination Task
    Efficiency of Weight Distributions
    Guidelines for Future Experiments

REFERENCES

BIOGRAPHICAL SKETCH


LIST OF FIGURES

1. Variance of the absolute difference distribution [VAR(g)] is plotted as a function of the values of the standard deviation of the intertone times.

2. The mean of the absolute difference distribution [E(g)] is plotted as a function of the values of the standard deviation of the intertone times.

3. d' is plotted as a function of the values of the standard deviation of the intertone times. The star symbols represent the predictions from ROC analysis. The solid line represents a power-function fit to predictions (Eq. 20).

4. The top part of this figure represents a sample of SAME trials for the two sequences of a pattern, whereas the bottom portion represents a sample of DIFFERENT trials. The frequency and duration of each tone were 1000 Hz and 25 ms, respectively.

5. In this figure, each sequence is divided into 2 subsequences, A and B. A 50 ms gap separates the two subsequences. The top part of the figure represents a sample of SAME trials for the two sequences of a pattern. The bottom part represents a sample of DIFFERENT trials.

6. Relative weights are plotted as a function of each temporal position. The sequence correlation is equal to 0. Each panel represents data obtained from one listener.

7. Relative weights, averaged from 4 listeners, are plotted as a function of each temporal position. The sequence correlation is equal to 0. The error bars represent one standard error of the mean.

8. Relative weights are plotted as a function of each temporal position. The sequence correlation is equal to 0.8. Each panel represents data obtained from one listener.

9. Relative weights, averaged from 4 listeners, are plotted as a function of each temporal position. The sequence correlation is equal to 0.8. The error bars represent one standard error of the mean.

10. Relative weights are plotted as a function of each temporal position. Both the sequence correlation and the repetition correlation are equal to 0. Each panel represents data obtained from one listener.

11. Relative weights, averaged from 4 listeners, are plotted as a function of each temporal position. Both the sequence correlation and the repetition correlation are equal to 0. The error bars represent one standard error of the mean.

12. The upper part of this figure represents a sample of the SAME trials, while the lower part represents the DIFFERENT trials for the control condition of Experiment 2.

13. This figure represents a summary of the stimuli used in the first case of Experiment 2. In this case, the second or the sixth intertone time has a different mean duration than all others. The standard deviation is the same for all positions.

14. This figure represents a summary of the stimuli used in the second case of Experiment 2. In this case, the second or the sixth intertone time has a different standard deviation than all others. The mean duration is the same for all positions.

15. Control condition for experiment I. Relative weights are plotted as a function of each temporal position. Each panel represents the data obtained from one listener. The mean and standard deviation for each position were 60 and 20 ms, respectively.

16. The average relative weights, from 4 listeners, are plotted as a function of each temporal position. The error bars represent one standard error of the mean. The mean and standard deviation for each position were 60 and 20 ms, respectively.

17. Relative weights are plotted as a function of each temporal position. Each panel represents data from one listener. The mean duration for the second position was 20, 40, 80, or 100 ms.

18. The average relative weights, from all listeners, are plotted as a function of each temporal position. The error bars represent one standard error of the mean. The mean duration for the second position was 20, 40, 80, or 100 ms.

19. Relative weights are plotted as a function of each temporal position. Each panel represents data from one listener. The mean duration for the sixth position was 20, 40, 80, or 100 ms.

20. The average relative weights, from all listeners, are plotted as a function of each temporal position. The error bars represent one standard error of the mean. The mean duration for the sixth position was 20, 40, 80, or 100 ms.

21. This figure depicts average relative weights as a function of the different mean values for the second temporal position.

22. This figure depicts average relative weights as a function of the different mean values for the sixth temporal position.

23. Relative weights are plotted as a function of each temporal position. Each panel represents the data from one listener. The standard deviation for the second position was 40, 60, or 100 ms.

24. The average relative weights, from all listeners, are plotted as a function of each temporal position. The error bars represent one standard error of the mean. The standard deviation for the second position was 40, 60, or 100 ms.

25. Relative weights are plotted as a function of each temporal position. Each panel represents the data from one listener. The standard deviation for the sixth position was 40, 60, or 100 ms.

26. The average relative weights, from all listeners, are plotted as a function of each temporal position. The error bars represent one standard error of the mean. The standard deviation for the sixth position was 40, 60, or 100 ms.

27. This figure depicts average relative weights as a function of the different standard deviation values for the second temporal position.

28. This figure depicts average relative weights as a function of the different standard deviation values for the sixth temporal position.

29. The top portion represents a sample of SAME trials for the two sequences of a pattern in the random-mode binaural condition. The bottom portion represents a sample of DIFFERENT trials in this condition. The frequency and duration of each tone were 1000 Hz and 25 ms, respectively.

30. The abscissa represents the experimental conditions and the ordinate represents the percent correct. Each panel represents the data from one listener.

31. Average percent correct is plotted as a function of the experimental conditions. The error bars represent one standard error of the mean.

32. Average percent correct, for 4 listeners, is plotted as a function of the experimental conditions. The solid line represents the obtained data, and the dashed line represents the model's prediction.

33. The data connected by the solid line represent the average relative weights, from all listeners, as a function of each temporal position. The dashed line represents the ideal weights. The standard deviation for the sixth position was 40, 60, or 100 ms.

34. Efficiency calculations from Experiment I. The ordinate represents the efficiency value and the abscissa represents the correlation.

35. Efficiency calculations from part 1 of Experiment II. The ordinate represents the efficiency as a function of the value of the mean intertone time for the second position.

36. Efficiency calculations from part 1 of Experiment II. The ordinate represents the efficiency as a function of the value of the mean intertone time for the sixth position.


Abstract  of  Dissertation  Presented  to  the  Graduate  School  of  the  University  of  Florida  in 
Partial  Fulfillment  of  the  Requirements  for  the  Degree  of  Doctor  of  Philosophy 


THE DISCRIMINATION OF AUDITORY TEMPORAL PATTERNS: EFFECT OF
TEMPORAL  POSITION,  CHANNEL,  AND  INTERTONE  TIME  STATISTICS 

By 

Toktam  Sadralodabai 
May  1996 


Chairman: Robert D. Sorkin
Major  Department:  Psychology 

For  all  three  experiments,  listeners  were  presented  with  2 sequences  of  9 tones.  The 
tone  durations  were  always  25  ms  and  the  sequence  patterns  were  defined  by  intertone  times. 
The  listener  had  to  decide  if  the  two  sequences  were  the  same  or  different.  Three  studies 
examined  the  importance  of  temporal  position,  mean  and  variance  of  intertone  times  and  the 
effect  of  auditory  channel  on  listeners'  decisions  in  an  auditory  pattern-discrimination  task. 

The  first  study  examined,  using  a correlation-weight-estimation  technique,  the  ability 
of  listeners  to  allocate  decision  weights  to  each  intertone  time  of  the  auditory  temporal 
pattern.  This  study  investigated  the  importance  of  ordinal  position  within  the  sequence  on 
listeners'  decisions.  Results  showed  that  all  listeners  consistently  allocated  different  weights 
to different temporal positions, with the first position carrying the largest weight. The effect
of  repetition  of  some  of  the  components  in  the  sequence  (rhythmicity)  on  discrimination  of 
the  temporal  patterns  was  also  tested.  Results  showed  that  this  alteration  did  not  have  a 
noticeable  effect  on  performance. 

The  second  study  examined  the  ability  of  listeners  to  allocate  weights  to  temporal 
positions  based  on  the  diagnosticity  of  the  information  provided  by  each  component’s 
position.  By  diagnosticity,  it  is  meant  either  the  magnitude  of  that  information  or  its 
statistical  difference  from  the  other  intertone  times.  Results  indicated  that  assigning 
different  standard-deviation  values  affected  the  listeners'  weight-distribution  strategy  greatly, 
whereas  assigning  different  mean  values  did  not  have  a noticeable  influence. 

The  third  study  examined  the  effect  of  sequence  presentation  across  auditory 
channels.  In  one  condition,  the  ear  to  which  each  tone  of  the  sequence  was  presented  was 
chosen  randomly,  and  in  a second  condition,  the  tones  were  presented  in  alternating  order 
between  the  two  ears.  The  results  demonstrated  that  when  the  tones  of  a sequence  were 
presented  randomly  to  each  channel,  performance  was  at  chance  (0.5  probability  of  correct 
response  in  a 2 IFC  task).  However,  when  the  tones  were  presented  alternately  and  listeners 
were  expecting  the  stimulated  channel,  their  performance  greatly  improved  (approximately 
0.75-0.80  probability  of  correct  response). 




INTRODUCTION 


The  efficiency  with  which  the  auditory  system  encodes  temporal  patterns  of  acoustic 
information  is  not  only  theoretically  important  for  what  it  may  reveal  about  the  underlying 
temporal  mechanisms  of  hearing,  but  it  also  has  important  practical  implications  for  speech 
and  music  perception.  Auditory  pattern  discrimination  may  itself  be  considered  a 
subcategory  of  a broader  topic  of  intense  interest  in  sensory  psychology,  that  of  pattern 
recognition.  The  following  sections  present  a brief  summary  of  some  of  the  basic  ideas  and 
descriptions  related  to  pattern  recognition  and  its  importance  to  the  coding  of  sensory  and 
perceptual information. Pattern recognition has been examined mostly in the visual system,
and several computational models of visual pattern recognition have been advanced (Marr,
1982). The literature on auditory pattern recognition may be separated into two
domains: spectral patterns and temporal patterns. The former area has become known
as  profile  analysis  or  spectral-shape  discrimination.  The  latter  area,  temporal  pattern 
recognition,  has  been  studied  for  such  stimulus  parameters  as  the  frequency  content  of  each 
tone  of  a sequence  that  comprises  the  pattern.  The  current  set  of  experiments  investigates 
the  importance  of  factors  related  to  the  discrimination  of  auditory  temporal  patterns  in  a 
same-different  paradigm.  These  patterns  are  defined,  not  by  the  frequency  content  of  the 
constituent  tones,  but  by  the  intertone  times  between  tones  of  the  same  frequency.  In 
addition, a newly described technique of psychophysical measurement (Richards and Zhu,
1994;  Lutfi,  1995),  designed  to  derive  weights  for  each  component  of  a multi-component 
stimulus,  is  used  to  determine  some  detailed  properties  of  each  defining  part  of  the  auditory 
pattern.  Specifically,  three  areas  of  auditory  temporal  pattern  recognition  are  examined.  The 
first  study  examines  the  importance  of  temporal  location  of  the  pattern  information.  In  other 
words,  do  all  parts  of  a pattern  carry  equal  weight  in  an  observer’s  ability  to  recognize  that 
pattern?  The  second  study  tests  the  effects  of  what  is  termed  “diagnosticity”  of  the  temporal 
information  on  listeners’  performance.  If  some  parts  of  the  pattern  were  made  more 
informative,  i.e.,  more  diagnostic,  would  an  observer’s  strategy  in  attending  to  certain  parts 
of  the  pattern  be  affected  by  such  information?  The  third  study  examines  the  effect  of 
presenting  the  temporal  sequences  across  auditory  channels  (i.e.,  dichotically).  We  begin 
with  a brief  description  of  pattern  recognition  and  its  importance  to  sensory  information 
processing. 


Pattern  Recognition 

One  of  the  primary  functions  of  perceptual  systems  is  to  convert  raw  and 
unstructured  sensory  data  into  coherent  and  organized  information.  Perception  is  partially 
geared  to  reduce  the  overwhelming  amount  of  data  by  determining  whether  a certain  piece 
of  sensory  data  belongs  to  a class  of  information  that  may  be  of  interest  to  the  organism. 
The perceptual processes by which some part of the experience is determined to belong
to a certain class of information are termed "pattern recognition" (e.g., Lindsay and Norman,
1972; Dodwell, 1970). Thus, pattern recognition can be defined as the process of assigning
a certain  internal  meaning  to  some  experience  as  well  as  specifying  the  appropriate  response 
(Murch,  1973).  Pattern  recognition  is  central  to  the  whole  perceptual  process.  The  fact  that 
a natural  environment  seldom  presents  us  with  isolated  signals,  such  as  sounds  or  sights,  but 
rather  patterns  of  stimuli  to  which  we  have  to  respond,  points  to  an  important  aspect  of 
pattern  recognition.  When  data  is  presented  in  some  form,  the  pattern  itself  can  carry 
information  as  to  the  nature  or  identity  of  the  signal.  The  ability  to  identify  relevant  signals 
may be one reason why a fundamental property of living organisms is the ability
to  recognize  and  distinguish  different  patterns.  It  is,  therefore,  not  surprising  that 
information  is  usually  summarized  in  some  form  of  pattern  in  almost  any  complex  sensory 
domain,  especially  in  hearing  and  vision.  In  these  two  systems,  there  has  been  relatively 
more  research  and  theoretical  speculation  on  the  organization  of  information  in  terms  of 
patterns (e.g., Hubel and Wiesel, 1963; Marr, 1982; Movshon et al., 1985; Siebert, 1968).

All  sensory  systems,  regardless  of  their  differences,  extract  some  of  the  same  basic 
information  from  stimuli.  This  may  be  one  reason  for  some  similarities  in  the  physiological 
organization  among  different  sensory  systems  (Colavita  et  al.,  1974).  For  instance,  in  a 
visual  scene,  the  visual  system  is  presented  with  a tremendous  amount  of  information  that 
requires  encoding.  One  way  of  dealing  efficiently  with  such  information  is  to  reduce  the 
amount  of  information  by  utilizing  redundancy  in  that  information.  In  dealing  with  spatially 
periodic  visual  patterns,  for  example,  such  an  opportunity  is  presented  to  the  visual  system. 
There are many examples in a natural environment that possess such periodicities in their
structures. It has been suggested by Campbell and Robson (1968) and DeValois and
DeValois (1988) that the spatial-frequency spectra of visual stimuli are used for encoding
of information about the shape or spatial position of these stimuli. They suggest that the
periodic  patterns  in  many  naturally  occurring  visual  objects  can  be  encoded  efficiently  via 
their  local  spatial-frequency  content.  For  example,  the  repetitive  structure  of  the  leaves  of 
most  plants  makes  the  dominant  spatial  frequency  of  their  patterns  a distinctive  feature  that 
the  visual  system  would  ideally  process  (DeValois  and  DeValois,  1988).  In  animals, 
repetitive  characteristic  features,  such  as  a bird's  feathers,  or  wrinkle  patterns  in  the  skin,  are 
also  visually  useful  periodic  patterns.  Again,  these  patterns  can  be  decoded  in  the  spatial 
frequency domain. For both examples, there is a considerable amount of redundant
information  due  to  the  periodicity  in  the  patterns.  Thus,  in  the  case  of  these  patterns,  the 
visual  system  seems  to  utilize  local  spatial  periodicities  in  a very  efficient  manner.  The 
important  point  is  that  a specification  of  the  overall  mean  spatial  frequency  of  the  pattern 
gives  the  visual  system  the  important  spatial  and  structural  information  that  could  be 
virtually  impossible  to  encode  in  terms  of  individual  points  or  edges.  Thus,  the  periodic 
nature  of  visual  patterns  can  be  considered  as  one  important  example  of  how  sensory 
systems  have  evolved  to  efficiently  recognize  natural  patterns  of  information.  Another 
example,  the  topic  of  this  dissertation,  is  the  ability  and  efficiency  of  the  auditory  system  in 
encoding  patterns  of  acoustic  information. 

Coding  of  Information  in  the  Pattern  of  Auditory  Stimuli 

In  the  auditory  system,  pattern  information  is  primarily  carried  in  terms  of  the 
spectral and temporal structures of the stimulus. Therefore, in dealing with complex sounds,
the  sounds  can  be  either  patterned  according  to  their  frequency  content  (profile)  or  how  they 
vary  as  a function  of  time,  or  both.  Hence,  these  two  aspects  of  the  sound  stimulus  are  of 
great importance in pattern coding. The auditory system appears to be designed for the
efficient detection of unexpected changes in both the temporal and spectral
patterns of stimulation (Moore, 1988). Examples of studies in the spectral domain
include profile analysis experiments.

Many  studies,  within  the  past  decade,  have  focused  on  how  auditory  patterns  are 
encoded in the spectral domain. In these studies (e.g., Green et al., 1983; Berg and Green,
1990;  Kidd  et  al.,  1991;  Zara  et  al.,  1993),  the  signal  is  usually  an  increase  or  decrease  in  the 
level  of  one  component  of  a multi-tone  complex  that  would  result  in  a "bump"  in  an 
otherwise  flat  spectrum.  Using  a random  variation  in  overall  level  to  eliminate  the  loudness 
cue,  the  listener  has  to  detect  a change  in  the  pattern  of  levels  across  frequency.  That  is,  they 
have  to  encode  the  profile  or  shape  of  the  spectrum  of  the  complex  sound.  Results  of  these 
experiments  have  demonstrated  that  performance,  with  roving  levels  between  intervals,  is 
as  good  as  when  the  discrimination  of  the  intensity  variation  is  between  trials  (Green,  1988). 
These  results  indicate  that  listeners  make  their  decisions  based  on  the  pattern  of  the  spectral 
structure,  since  it  is  easier  to  hear  a change  in  spectral  shape  than  an  increase  in  the  intensity 
of  a single  sinusoid  when  the  level  is  roved.  Green  and  Mason  (1985)  demonstrated  that  the 
phase  relationship  among  components  was  irrelevant  for  the  detection  of  a spectral  change. 
This is an important finding which suggests that, in these tasks, it is the shape of the power
spectrum that is the cue to detection and not the temporal envelope of the waveform. The
amplitude spectrum of a complex tone is, of course, unaffected by its phase spectrum;
however,  the  exact  phase  relationship  between  the  components  of  the  complex  does  affect 
the  temporal  structure  of  the  overall  waveform.  That  there  were  no  effects  of  component 
phase  is  strong  evidence  in  favor  of  the  idea  that  this  type  of  pattern  recognition  occurs  in 
the  spectral  domain  and  not  the  temporal  domain.  Green  and  Mason  (1985)  also  investigated 
the  role  of  signal  frequency  in  profile  discrimination.  Their  results  indicated  that  for 
moderate  frequencies,  listeners  detected  the  spectral  changes  in  the  center  of  the  complex 
more  easily  than  at  either  extreme,  providing  further  evidence  that  the  most  important  detail 
for  profile  discrimination  and  the  primary  source  of  information  must  be  the  pattern  of  the 
spectrum. 

The  other  important  domain  in  which  auditory  patterns  are  encoded  is  the  temporal 
domain.  Since  there  are  many  naturally  periodic  auditory  stimuli,  it  makes  sense  for  the 
auditory  system  to  perform  a temporal  encoding  of  such  information.  Temporal 
discrimination  of  patterns  is  not  only  important  as  a basic  psychoacoustic  issue,  but  it  also 
plays  an  important  role  in  the  perception  of  speech  and  music  (Fraisse,  1966;  Martin,  1972; 
Steedman,  1977;  Vos  and  Rasch,  1981).  For  example,  normal  speech  perception  demands 
the  recognition  and  identification  of  sequences  of  acoustic  stimuli,  both  at  a phonetic  level 
and at a higher, more cognitive level (Klatt, 1976; Liberman et al., 1967; Stevens and House,
1972). Therefore, the discrimination of sequences of sounds must surely be a prerequisite
for the appreciation of speech. Another important and common example is the perception
of music. Again, it is the repetitive pattern of musical notes, both at a basic level and at
a more complex level of rhythmicity, that requires the use of a temporal pattern-recognition
scheme (Deutsch, 1979; Fraisse, 1982; Martin and Struges, 1974). These are just some
reasons  why  temporal  pattern  discrimination  has  been  the  subject  of  many  studies  in 
psychoacoustics  over  the  past  two  decades.  In  most  of  these  studies,  either  the  timing  or  the 
frequency  aspect  of  the  tones  in  the  temporal  patterns  is  manipulated.  These  studies  can  be 
categorized  into  3 different  groups. 

The first experimental paradigm includes the studies by Watson and colleagues. Their
studies (e.g., Watson et al., 1975; Watson et al., 1976; Kidd and Watson, 1992) are
concerned  largely  with  the  frequency  aspect  of  the  tone  sequence.  In  these  studies,  listeners 
are  presented  with  two  sequences,  in  which  the  frequency  of  one  of  the  tones  is  randomly 
varied  and  listeners  are  asked  to  judge  if  the  two  patterns  of  the  sequences  are  the  same  or 
different.  In  one  of  their  initial  studies  (Watson  et  al.,  1975),  results  indicated  that  the 
ordinal  position  of  the  information  is  important;  listeners'  performance  was  better  when  the 
change  in  the  pattern  occurred  towards  the  end  of  the  sequence.  This  phenomenon  was 
called  the  recency  effect.  Watson  et  al.  (1976)  applied  the  term  "informational  masking"  to 
refer  to  the  phenomenon  that  later-occurring  components  interfered  with  the  processing  of 
the  earlier  components.  Watson  et  al.  (1986)  have  also  explored  the  effect  of  uncertainty  on 
the  listener's  performance,  by  employing  one  tone  as  the  target  tone  and  the  remaining  tones 
within  a sequence  as  context  tones.  In  this  study,  three  levels  of  uncertainty  were  examined. 
First,  high  uncertainty  represented  the  condition  when  the  context  tones  were  drawn  from 
a continuous  distribution.  Second,  medium  uncertainty  represented  the  condition  when  the 
context tones were drawn from one of ten predetermined samples. Third, minimal
uncertainty was the condition where only one sequence of tones was used as context tones
and the test tone was always presented at the same temporal position and frequency. Results
suggested that listeners' performance was at its best when the uncertainty of both the target
and the context tones was low. In a recent study by Kidd and Watson (1992), it was
suggested  that  an  important  factor  in  temporal  pattern  discrimination  is  the  target's 
proportion  of  the  pattern's  total  duration.  Performance  improved  with  an  increase  in  the 
target's  proportion  of  the  total  pattern's  duration.  Consequently,  they  proposed  a model, 
appropriately called “the proportion-of-the-total-pattern-duration (PTD)” rule. The PTD rule
states  that  each  individual  component  of  an  unfamiliar  tone  sequence  is  resolved  with  an 
accuracy  that  is  based  on  its  proportion  of  the  total  duration  of  the  sequence.  Lutfi  (1993), 
however,  has  suggested  that  Watson  and  colleagues'  results  can  be  accounted  for  by  an 
alternative  model,  termed  CoRE,  which  uses  as  its  basis  Shannon's  (1948)  definition  of 
Entropy  as  a means  of  determining  the  amount  of  information  contained  in  a pattern.  The 
CoRE  (component  relative  entropy)  model  suggests  that  the  most  important  factor  in  these 
studies  is  the  relative  variance  of  the  target  tone.  For  these  tasks,  performance  can  be 
predicted  by  a decision  variable  that  is  based  on  the  weighted  sum  of  the  relative  variances 
of  each  component's  mean  value.  However,  additional  research  is  needed  to  examine  which 
model  is  more  effective  in  predicting  performance. 
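
As a minimal illustration of the quantity on which the PTD rule is based, the following Python sketch computes each component's proportion of the total pattern duration. The function name and the example durations are hypothetical and are not taken from Kidd and Watson (1992). The rule as summarized above does not specify how accuracy depends on this proportion, so no such mapping is attempted here.

```python
def duration_proportions(component_durations_ms):
    """Return each component's share of the total pattern duration.

    Hypothetical helper for illustration only: the input is a list of
    component durations in milliseconds, and the output sums to 1.
    """
    total = sum(component_durations_ms)
    if total <= 0:
        raise ValueError("total pattern duration must be positive")
    return [d / total for d in component_durations_ms]


# Illustrative (made-up) pattern: two 50 ms components and one 100 ms target.
example_pattern_ms = [50, 50, 100]
example_shares = duration_proportions(example_pattern_ms)
```

Under the PTD rule, the 100 ms component in this made-up pattern, which occupies half of the total duration, would be expected to be resolved more accurately than either 50 ms component.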

The second experimental paradigm related to the discrimination of temporal
patterns involves ideal-observer analysis from Signal Detection Theory (Green and Swets,
1966). This set of studies has been conducted primarily by Sorkin and his colleagues
(e.g., Sorkin, 1990; Sorkin and Montgomery, 1991). One important difference between the
Sorkin et al. studies and the Watson et al. studies is the analysis of optimum decision processes by
Sorkin et al. This has allowed a wider range of stimulus configurations and testing
paradigms.  In  these  studies,  the  listener  is  presented  with  two  sets  of  sequences  that  are 
comprised  of  tones  and  defined  by  intertone  times.  The  task,  again,  is  a same-different  task. 
The  listener  has  to  decide  if  the  two  sets  of  sequences  have  the  same  or  different  temporal 
patterns.  In  one  of  the  initial  studies,  Sorkin  and  colleagues  systematically  varied  the 
temporal  envelopes  between  sequences  by  changing  the  correlation  between  corresponding 
intertone  times  in  the  two  sequences.  Results  indicated  that  the  listeners'  performance, 
measured  in  terms  of  the  sensitivity  index  (d'),  dropped  as  the  correlation  between  the  two 
sequences increased (the temporal envelopes becoming more similar). These results led
to  the  proposal  of  a model  for  temporal  pattern  recognition  called  the  pattern  correlation 
model  (Sorkin,  1990),  which  essentially  has  the  following  features:  1)  listeners  are  capable 
of  keeping  in  memory  for  a very  brief  period  of  time  the  time  between  the  tone  onsets,  and 
2)  using  this  data,  they  calculate  a correlation  between  the  intertone  times  and  make  a 
decision  based  on  the  magnitude  of  this  correlation  (Pearson  product-moment  correlation 
coefficient).  This  model  also  incorporates  internal  noise  in  the  final  calculations  of  the 
correlation.  The  magnitude  of  the  internal  noise  is  assumed  to  grow  in  proportion  to  the 
duration  of  the  patterns. 
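
The two features of the pattern correlation model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the decision criterion, the constant internal-noise standard deviation, and the example intertone times are hypothetical values chosen for the sketch, not parameters reported by Sorkin (1990), and the growth of internal noise with pattern duration is not modeled.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pattern_correlation_decision(seq1, seq2, criterion=0.9,
                                 noise_sd=0.0, rng=None):
    """Respond 'same' when the (noisy) correlation exceeds a criterion.

    criterion and noise_sd are hypothetical illustration parameters.
    """
    rng = rng or random.Random()
    r = pearson(seq1, seq2) + rng.gauss(0.0, noise_sd)
    return "same" if r > criterion else "different"


# Intertone times in ms (made-up values for illustration only).
first = [40.0, 75.0, 20.0, 60.0, 90.0, 35.0, 55.0, 80.0]
jittered = [52.0, 60.0, 41.0, 48.0, 70.0, 66.0, 30.0, 85.0]

print(pattern_correlation_decision(first, list(first)))   # identical patterns
print(pattern_correlation_decision(first, jittered))      # weakly correlated patterns
```

Setting `noise_sd` above zero makes the simulated listener's responses probabilistic, which is how the internal-noise assumption would enter such a sketch.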

Evidence  supporting  this  model  as  well  as  evidence  for  a relationship  between 
temporal-pattern  discrimination  and  speech  recognition  has  been  obtained  with  cochlear 
implant  patients  in  a study  conducted  by  Collins  et  al.  (1994).  They  found  that  their 
observers’  ability  to  discriminate  temporal  patterns  depended  on  the  temporal  correlation 
between the two sequences, as predicted by the temporal-pattern-correlation model. In
addition, these results demonstrated that the observers' ability to discriminate arhythmic
sequences was positively correlated with the observers' speech recognition performance.

The third experimental paradigm has been pursued by researchers who are
concerned  with  the  role  of  rhythmicity  in  the  perception  of  temporal  patterns.  The  major 
determinant  of  rhythm  is  the  duration  of  events  (Voss  and  Rasch,  1981).  Consequently,  the 
production  of  rhythmic  sounds  implies  that  the  duration,  hence  temporal  location  of  each 
sound  element,  be  related  to  other  locations  in  the  resulting  pattern.  Thus,  one  of  the 
properties  of  a rhythmic  auditory  pattern  is  that  certain  elements  in  the  pattern  are  temporally 
redundant.  It  is  this  temporal  redundancy  that  permits  events  to  be  anticipated  in  real  time. 
Therefore,  for  the  perception  of  rhythmically  patterned  sounds,  perception  of  initial  elements 
in  a pattern  allows  later  elements  to  be  anticipated  in  time.  Applying  this  idea  of  rhythm  to 
speech  sounds  would  imply  the  concept  of  relative  timing.  This  means  that  the  location  of 
each  element  along  the  time  dimension  is  determined  relative  to  the  locations  of  all  other 
elements  in  the  sequence,  adjacent  and  nonadjacent.  It  is  this  logic  that  Martin  (1972)  uses 
to  suggest  there  are  mechanisms  for  listening  in  real  time.  According  to  Martin,  these 
mechanisms  are  optimally  designed  to  make  use  of  the  temporal  redundancies  that  are 
always  present  in  rhythmically  structured  sound  inputs.  Therefore,  the  redundancies  are 
highly  correlated  with  the  distribution  of  information  in  the  sequence  and  are  made  use  of 
during  ongoing  perception  in  a highly  efficient  way.  Many  other  studies  (e.g.,  Martin,  1972; 
Sturges and Martin, 1974; Halpern and Darwin, 1982; Jones et al., 1981) have noted the
importance  of  rhythmic  context  for  better  perception  of  temporal  patterns. 




Related  Physiological  Reports 

In  one  of  the  first  physiological  studies  on  the  discrimination  of  auditory  temporal 
patterns  and  their  correlates,  Diamond  and  Neff  (1956)  showed  that  the  auditory  cortex  in 
the  cat  plays  a central  role  in  pattern  discrimination.  They  demonstrated  that  even  after 
cortical ablation, if a small part of the primary auditory cortex (AI), the secondary
auditory cortex (AII), or the posterior ectosylvian field (EP) remained, the animals were capable
of  relearning  the  pattern-discrimination  task.  In  another  study  by  Cornwell  (1967),  it  was 
shown  that  when  the  Insular-Temporal  (IT)  cortex  was  specifically  lesioned,  cats  were  not 
capable of relearning the auditory pattern-discrimination task. In a series of experiments
conducted by Colavita (1972; 1974), the importance of IT, irrespective of the modality
(auditory, visual, vibro-tactile) of the stimuli, was explored with respect to its effects on
temporal pattern discrimination. Results demonstrated that the IT cortex plays a
particularly important role in such discrimination tasks. It was suggested that IT may govern
the  utilization  of  the  temporal  relations  of  stimuli  in  the  auditory,  visual,  and  somatosensory 
modalities.  In  a study  conducted  by  Dewson  et  al.  (1970)  on  rhesus  monkeys,  strong 
evidence  was  observed  that  suggested  that  the  superior  temporal  gyrus  of  cortex  is  involved 
in  the  monkey's  use  of  temporally  patterned  information. 

Karasseva’s  (1972)  study  of  human  patients  who  had  lesions  of  the  superior  parts  of 
the  temporal  lobe  showed  that  these  patients  had  trouble  with  perception  of  short-duration 
sounds  as  well  as  in  estimating  auditory  rhythms  presented  at  a rapid  rate.  It  is  important  to 
note that such ablations usually did not distort other perceptual or motor abilities and were
therefore considered as specifically implicating rhythmic and temporal pattern recognition
abilities.  It  seems,  therefore,  that  most  physiological  studies  have  implicated  several  areas 
of  the  auditory  cortex  as  well  as  areas  that  are  not  specifically  auditory  in  the  estimation  of 
time  relationships  between  sounds.  In  addition,  since  the  discrimination  of  complex  sounds, 
such  as  speech,  is  upset  after  cortical  lesion,  it  is  suggested  that  the  human  auditory  cortex 
plays  an  important  role  in  the  perception  of  phonetic  patterns  (Luria,  1966). 

Current  Experiments 

The  purpose  of  the  present  experiments  is  to  better  understand  the  process  of 
temporal  pattern  discrimination  through  investigating  some  of  the  hierarchical  steps  that  are 
involved  in  the  analysis  of  temporal-order  information  and  through  examining  the  effects  of 
the  detailed  structure  of  the  patterns  on  their  discrimination.  The  importance  of  each  factor 
is  examined  by  determination  of  decision  weights  given  by  listeners  to  each  component  of 
the  pattern.  In  the  past,  results  from  various  studies  have  implied,  qualitatively,  the 
importance  of  some  factors  in  the  discrimination  of  temporal  patterns.  However,  here,  the 
importance  of  similar  factors  can  be  determined  quantitatively. 

In  experiment  I,  a weight-estimation  technique  is  used  to  determine  the  importance 
of  each  ordinal  position  of  temporal  information  within  a pattern  in  the  discrimination  of 
auditory  temporal  patterns.  Weights  were  derived  for  each  and  every  temporal  position.  The 
results  from  this  experiment  provided  some  unexpected  data  that  are  discrepant  from  those 
suggested  by  Watson  et  al.'s  recency  effect  (1975). 




In  experiment  II,  the  intertone  times  were  identical  to  those  used  in  experiment  I, 
except  for  the  intertone  time  of  one  of  the  positions  within  each  sequence.  This  intertone 
time  had  a different  statistic  than  all  the  other  positions.  The  purpose  of  this  experiment  was 
to  examine  the  effect  of  diagnosticity  of  the  information  on  the  listener's  decision.  In  the 
first  case  of  this  experiment,  the  mean  of  the  intertone  time  was  the  statistic  that  was  varied. 
In  the  second  case,  the  standard  deviation  of  the  intertone  time  was  different  than  that  of  the 
remaining  intertone  times. 

In  experiment  III,  the  effect  of  presenting  the  patterns  in  different  auditory  channels 
(different  ears)  was  examined.  Each  tone  of  each  pattern  was  presented  either  randomly  to 
a different channel, or alternately to each channel. Results from this experiment provide a
basis  for  understanding  the  possible  different  neural  sites  that  may  be  important  in  pattern 
recognition. 

We  begin  with  a description  of  the  analysis  technique  used  in  the  current  experiments 
to  estimate  the  weight  or  importance  of  the  different  parts  of  an  auditory  temporal  sequence. 


THE  MODEL  FOR  WEIGHT  ESTIMATION 


One  important  aspect  of  understanding  pattern  perception  is  the  weight  or  salience 
given  to  the  individual  elements  of  the  pattern.  Recently,  Richards  and  Zhu  (1994)  and 
Lutfi  (1995)  suggested  a correlational  analysis  technique  that  allows  one  to  assess  the 
importance  or  weights  given  to  each  component  of  a multi-component  complex  stimulus. 
The collection of weights (or relative weights) provides a measure of the importance of each
component  in  a discrimination  or  detection  task.  In  the  current  experiments,  we  have  used 
this  correlation  technique  for  deriving  observers’  weights  for  intertone  times  of  sequences 
at  each  ordinal  position.  Other  similar  methods  within  the  past  few  years  have  also  been 
proposed  (e.g.  Berg,  1989;  Berg,  1990).  The  recent  correlational  method  for  deriving 
weights  uses  some  of  the  same  principles  as  those  originally  proposed  by  Berg's  analysis. 
Because  some  of  the  earlier  studies  of  Sorkin  and  colleagues  (e.g.  Sadralodabai  et  al.,  1993) 
used  this  technique,  the  technique  will  briefly  be  described  below  to  allow  comparison  with 
the  data  collected  here  using  the  correlational  method  and  to  make  the  reader  aware  of  some 
of  the  differences  in  the  assumptions  made  by  the  two  techniques  and  differences  in  the 
resultant  data. 

Berg's  analysis,  termed  COSS  (conditional-on-a-single-stimulus),  is  an  empirical 
method,  designed  within  the  framework  of  Signal  Detection  Theory.  This  method  is  used 
to determine the importance of each observation or component in a multi-component
detection or discrimination task. The goal of this analysis is to specify the role that each
component plays independent of all the other components. The principal ingredient for
estimating the effect that each element has on the observer's decision is the perturbation in
the magnitude of each element. This type of analysis assumes that each observation is drawn
from one of two normal distributions, with mean $\mu_s$ for the signal distribution and
$\mu_n$ for the noise distribution, and a common standard deviation $\sigma_c$. For each
observation, the subject has to decide if that observation is signal (s) or noise (n). In order
to implement the
COSS  technique,  the  signal  and  noise  trials  are  separated  into  equal  groups  or  bins  that  are 
used  to  calculate  the  proportion  of  correct  responses  conditioned  on  the  values  of 
perturbation  of  each  component.  The  slope  of  the  conditionally  determined  psychometric 
function,  for  each  component,  determined  from  the  proportion  of  correct  responses  at  each 
bin  is  related  to  the  relative  weight  given  to  that  component.  The  assumption  is  that  the 
steeper  this  slope  is,  the  more  weight  is  given  to  that  component  and  the  shallower  the  slope, 
the  less  weight  is  given  to  that  component.  The  weights  are  relative  and  may  be  obtained 
by  comparing  the  weight  derived  for  each  observation  to,  for  example,  the  sum  of  the 
weights, which produces final weights that are normalized to unity ($\sum a_i = 1$). The COSS
analysis has been used successfully in pattern recognition tasks (e.g., Sadralodabai et al.,
1993; Montgomery and Sorkin, in press).
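
The binning step of a COSS-style analysis can be illustrated with a small simulation. The sketch below is not Berg's full procedure: it assumes a hypothetical observer who forms a weighted sum of three independent, equal-variance components with no internal noise, uses signal trials only, and summarizes each conditional psychometric function by a simple least-squares slope of proportion correct across equal-count bins. It shows only the qualitative point made above, that a steeper conditional function accompanies a larger weight; converting slopes to weight estimates requires the further steps of Berg's analysis, which are not reproduced here.

```python
import random


def simulate_signal_trials(weights, n_trials, mu_s=1.0, sd=1.0, seed=1):
    """Signal trials for a simulated observer who weights and sums components."""
    rng = random.Random(seed)
    criterion = sum(weights) * mu_s / 2.0   # midway between noise (0) and signal means
    trials = []
    for _ in range(n_trials):
        x = [rng.gauss(mu_s, sd) for _ in weights]
        decision = sum(w * xi for w, xi in zip(weights, x))
        trials.append((x, 1 if decision > criterion else 0))   # 1 = "signal" (correct)
    return trials


def coss_slope(trials, component, n_bins=8):
    """Least-squares slope of P(correct) across equal-count bins of one component."""
    ordered = sorted(trials, key=lambda t: t[0][component])
    size = len(ordered) // n_bins
    xs, ps = [], []
    for b in range(n_bins):
        chunk = ordered[b * size:(b + 1) * size]
        xs.append(sum(x[component] for x, _ in chunk) / size)
        ps.append(sum(r for _, r in chunk) / size)
    mx, mp = sum(xs) / n_bins, sum(ps) / n_bins
    num = sum((a - mx) * (p - mp) for a, p in zip(xs, ps))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den


true_weights = [0.5, 0.3, 0.2]              # hypothetical observer weights
trials = simulate_signal_trials(true_weights, n_trials=16000)
slopes = [coss_slope(trials, i) for i in range(len(true_weights))]
print([round(s, 3) for s in slopes])        # slopes fall in the same order as the weights
```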

The  correlational  method  is  computationally  simpler.  Correlational  analysis  is 
similar  to  COSS  in  that  it  requires  perturbation  of  the  value  of  each  component.  According 
to  this  analysis,  each  observation  may  be  presented  from  one  of  the  two  cases  (signal  or 
noise)  and  the  observer  must  decide  which  case  was  presented. 




The  observer's  decision  in  a temporal  pattern  recognition  task  can  be  represented  as 
the  decision  variable  D, 

(1)  $D = a_1 \Delta t_1 + a_2 \Delta t_2 + \dots + \varepsilon = \sum_{i=1}^{m} a_i \Delta t_i + \varepsilon$

where $a_i$ are the weights for the intertone times $\Delta t_i$ that comprise the temporal
pattern. The error term ($\varepsilon$) represents the accumulation of additive errors occurring
both before and after the application of weights. If we assume that the $\Delta t_i$ are
statistically independent, then the above equation can be seen as the general equation for
multiple regression analysis; the decision variable ($D$) may be considered as the dependent
variable and the $\Delta t_i$ as the linear predictor variables. With these assumptions, the total
variance in $D$ can be divided into $m+1$ parts: the variance in $D$ related to each of the $m$
intertone times $\Delta t_i$, and the variance related to the error term ($\varepsilon$):


(2)  $\sigma_D^2 = a_1^2 \sigma_{\Delta t_1}^2 + a_2^2 \sigma_{\Delta t_2}^2 + \dots + a_m^2 \sigma_{\Delta t_m}^2 + \sigma_\varepsilon^2$


Or, rewritten in terms of the proportion of the total variance accounted for,

(3)  $1 = \rho_{D\Delta t_1}^2 + \rho_{D\Delta t_2}^2 + \rho_{D\Delta t_3}^2 + \dots + \rho_{D\Delta t_m}^2 + \rho_{D\varepsilon}^2$

The $\rho_{D\Delta t_i}$ (correlation coefficients) are obtained from the linear least-squares
regression of $D$ on $\Delta t_1$, $D$ on $\Delta t_2$, etc. The weights can then be computed from:


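$a_1 = \rho_{D\Delta t_1} \left[ \frac{\sigma_D}{\sigma_{\Delta t_1}} \right], \quad a_2 = \rho_{D\Delta t_2} \left[ \frac{\sigma_D}{\sigma_{\Delta t_2}} \right], \quad \dots$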

which  merely  corresponds  to  the  regression  coefficients  in  the  analysis.  The  relative  weights 
can  be  derived  by  combining  these  terms  to  arrive  at  an  alternative  expression  for  two 
arbitrary  intertone  times, 


(4)  $\frac{a_1}{a_2} = \frac{\rho_{D\Delta t_1}\, \sigma_{\Delta t_2}}{\rho_{D\Delta t_2}\, \sigma_{\Delta t_1}}$

In order to compute relative weights for a temporal position, all that is needed is the
estimates of $\rho_{D\Delta t_i}$. However, it is important to note that when applying Eq. 4, $D$
would correspond to an observer's subjective decision variable. Since $D$ cannot be determined
directly, it is not possible to obtain direct estimates of $\rho_{D\Delta t_i}$. Thus, the
$\rho_{D\Delta t_i}$ have to be inferred from the relation of the subject's responses $R$ (either
1 for correct or 0 for incorrect) to $D$. The estimates of $\rho_{D\Delta t_i}$ can be obtained
from the point-biserial correlations ($r_{R\Delta t_i}$), where the $r_{R\Delta t_i}$ are simple
product-moment correlations, which can be computed from the standard formula for
obtaining correlations


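(5)  $r_{R\Delta t_i} = \frac{n \sum R\, \Delta t_i - \sum R \sum \Delta t_i}{\sqrt{\left[ n \sum R^2 - \left( \sum R \right)^2 \right] \left[ n \sum \Delta t_i^2 - \left( \sum \Delta t_i \right)^2 \right]}}$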


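Using this correlation definition, relative weights can be estimated by

(6)  $\frac{a_1}{a_2} = \frac{r_{R\Delta t_1}\, \sigma_{\Delta t_2}}{r_{R\Delta t_2}\, \sigma_{\Delta t_1}}$

where $\sigma_{\Delta t_i}$ are the standard deviations of the perturbations introduced by the experimenter.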
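
As an illustration of how the correlational estimate behaves, the following Python sketch simulates a hypothetical observer whose decision variable follows Eq. 1 and then recovers the relative weights from the correlation between the binary responses and each perturbation, scaled by the perturbation standard deviation. The weights, standard deviations, internal-noise level, and zero criterion are assumptions made for the example, and the response is coded 1 when the decision variable exceeds the criterion rather than as correct or incorrect.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation (point-biserial when x is 0/1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_observer(weights, sigmas, n_trials, noise_sd=5.0, seed=2):
    """Return perturbations (one list per position) and binary responses."""
    rng = random.Random(seed)
    deltas = [[] for _ in weights]
    responses = []
    for _ in range(n_trials):
        d = rng.gauss(0.0, noise_sd)            # additive internal noise (epsilon)
        for i, (a, s) in enumerate(zip(weights, sigmas)):
            dt = rng.gauss(0.0, s)              # Gaussian perturbation at position i
            deltas[i].append(dt)
            d += a * dt
        responses.append(1.0 if d > 0.0 else 0.0)
    return deltas, responses


def relative_weights(deltas, responses):
    """a_i proportional to r(R, dt_i) / sd(dt_i), normalized to sum to 1."""
    raw = []
    for dt in deltas:
        mean = sum(dt) / len(dt)
        sd = math.sqrt(sum((v - mean) ** 2 for v in dt) / len(dt))
        raw.append(pearson(responses, dt) / sd)
    total = sum(raw)
    return [w / total for w in raw]


true_weights = [0.5, 0.3, 0.2]                  # hypothetical
sigmas = [20.0, 35.0, 50.0]                     # hypothetical, in ms
deltas, responses = simulate_observer(true_weights, sigmas, n_trials=20000)
estimated = relative_weights(deltas, responses)
print([round(w, 3) for w in estimated])
```

Dividing each correlation by the corresponding standard deviation is what Eq. 6 implies when it is applied to every pair of positions; when all perturbations share one standard deviation, the normalized correlations alone give the relative weights.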

Both  the  COSS  technique  and  the  correlation-based  method  allow  a flexible  means 
of  quantifying  the  relative  weight  given  to  each  component  of  a pattern.  Both  techniques  are 
immune  from  the  effects  of  additive  internal  noise  (Berg,  1990;  Lutfi,  1995),  but  not 
multiplicative  internal  noise  (Lutfi,  1992;  Richards  and  Zhu,  1994). 

Also, it should be noted that an additional appealing feature of the correlational
method is that statistical tests may readily be used to examine the significance of differences
between weights (Lutfi, 1995), or confidence intervals on weights can be obtained to test the
hypothesis that the obtained weights differ from some other value, for example,
that obtained by a theoretical observer. Another advantage of the correlational method is
that in order to obtain an estimate of weight for a given observation, all that is required is to
compute the correlation between that observation and the observer's response. If the
assumption of independence does not hold, then partial correlations can be used to
estimate the observer's weights. This is required in one of our experiments, where we have
reason to believe that a correlation introduced between the perturbations of components
may have a noticeable effect. In such cases, the correlation may be replaced with the
partial correlation (Hays, 1963),


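(7)  $r_{R\Delta t_1 \cdot \Delta t_2} = \frac{r_{R\Delta t_1} - r_{R\Delta t_2}\, r_{\Delta t_1 \Delta t_2}}{\sqrt{\left( 1 - r_{R\Delta t_2}^2 \right) \left( 1 - r_{\Delta t_1 \Delta t_2}^2 \right)}}$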


This partial correlation reflects the relation of $R$ (the observer's response) to $\Delta t_1$
(intertone time) after adjusting for the relation of each ($R$ and $\Delta t_1$) to $\Delta t_2$.
Similar computations may be carried out for all $\Delta t_i$.
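
A short simulation can show why the partial correlation matters when perturbations are correlated. In the sketch below, a hypothetical observer bases the response on $\Delta t_2$ alone, while $\Delta t_1$ is correlated with $\Delta t_2$. The correlation value, noise level, and trial count are assumptions chosen for illustration. The simple correlation between the response and $\Delta t_1$ is then clearly nonzero, whereas the first-order partial correlation is close to zero, as it should be for an ignored component.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


rng = random.Random(3)
rho = 0.6                    # hypothetical correlation induced between dt1 and dt2
dt1, dt2, resp = [], [], []
for _ in range(20000):
    z2 = rng.gauss(0.0, 1.0)
    z1 = rho * z2 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    decision = z2 + rng.gauss(0.0, 0.5)         # observer ignores dt1 entirely
    dt1.append(z1)
    dt2.append(z2)
    resp.append(1.0 if decision > 0.0 else 0.0)

r_r1 = pearson(resp, dt1)                       # inflated by the dt1-dt2 correlation
r_r2 = pearson(resp, dt2)
r_12 = pearson(dt1, dt2)
r_r1_given_2 = partial_correlation(r_r1, r_r2, r_12)
print(round(r_r1, 3), round(r_r1_given_2, 3))
```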


DENSITY  FUNCTIONS  FOR  TEMPORAL  PATTERN  DISCRIMINATION 


In  this  section,  some  properties  of  the  distributions  of  intertone  times  are  described. 
While the distribution of the perturbations, discussed in the previous section, is Gaussian,
the cue that listeners would be likely to use in the same-different discrimination task is the
absolute value of the difference in the values of perturbations. It should be noted here that
the  multiplication  of  the  perturbation  values  (Sorkin,  1990)  is  another  possibility  that  may 
be  used  for  the  discrimination  task.  However,  one  important  reason  for  choosing  the 
absolute-difference  decision  rule  is  the  feasibility  of  this  rule  for  the  physiological  basis  of 
this  task.  In  addition,  the  difference  principle  has  been  successfully  used  by  other 
researchers  in  black  box  models  of  signal  detection  such  as  the  equalization-cancellation 
model (e.g., Durlach, 1972). Using the absolute difference values of perturbations means
that both a positive intertone time difference and a negative difference are effective in aiding
the listener in determining if the two sequences are the same or different. Thus, if $t_{1i}$ and
$t_{2i}$ are the intertone times of the first and second sequence, respectively, and $i$ indexes the
$i$th intertone time of the sequence, then

$g = |t_{1i} - t_{2i}|$

If the perturbations of the intertone times, $t_{1i}$ and $t_{2i}$, are Gaussian distributed, then
$0 \le g < +\infty$. In understanding how listeners might use the cues for temporal pattern
discrimination, it is helpful to describe how the mean and variance of $g$ change with the
perturbation values $\sigma_{ti}$.
If

(8)  $Z = t_1 - t_2$, then

$E(Z) = E(t_1) - E(t_2)$

(9)  $\mathrm{VAR}(Z) = \mathrm{VAR}(t_1) + \mathrm{VAR}(t_2)$

and again by definition (Pitman, 1993):

(10)  $\mathrm{VAR}(Z) = E(Z^2) - (E(Z))^2$

(11)  $E(Z) = \int_{-\infty}^{\infty} Z\, f(Z)\, dZ$

Let each distribution of intertone times be normal, $t \sim N(\mu, \sigma_{ti}^2)$; the difference
distribution between the $i$th intertone times from the first and second sequences is then
$f_i \sim N(0, 2\sigma_{ti}^2)$.

That is,

(12)  $E(f_i) = 0$

(13)  $\mathrm{VAR}(f_i) = 2\sigma_{ti}^2$

and by definition

(14)  $\mathrm{VAR}(f_i) = E(f_i^2) - (E(f_i))^2$

For simplicity, denote $g = |f_i|$. We first derive a simple expression that relates VAR(g) to
E(g). This expression is

(15)  $\mathrm{VAR}(g) = 2\sigma_{ti}^2 - E(g)^2$


The  distribution  of  interest  g has  a mean  that  is  easily  predicted  from  its  variance. 


PROOF  OF  EQ.  15: 

By combining Eqs. 12, 13, and 14 we have

(16)  $2\sigma_{ti}^2 = E(f_i^2) - 0$

From Eq. 10, we also have

(17)  $\mathrm{VAR}(g) = E(g^2) - E(g)^2$

and further, because mathematically $g^2 = f_i^2$, then:

(18)  $E(g^2) = E(f_i^2) = 2\sigma_{ti}^2$

Combining Eqs. 17 and 18, we have:

$\mathrm{VAR}(g) = 2\sigma_{ti}^2 - E(g)^2$, which is Eq. 15.

Because the integral definition of E(g) is difficult to evaluate and cannot be rewritten
in non-integral form in terms of the perturbation $\sigma_{ti}$ by itself, a polynomial
approximation for VAR(g) was derived using computer simulations. For $\sigma_{ti}$ from 5 to
200 ms in steps of 2 ms, VAR(g) was determined by sampling at each $\sigma_{ti}$ and calculating
$\sigma_{\Delta t}$ and VAR(g) with 100,000 samples per point. The resultant points were fitted
with a Matlab sixth-order polynomial routine that approximated VAR(g) as

(19)  $\mathrm{VAR}(g) = \sum_{j=0}^{6} \beta_j\, \sigma_{ti}^{\,j}$

The root-mean-square (rms) error between the function above and the fitted points
was less than 0.3 ms. The coefficients $\beta_j$ were

$\beta_0 = -9.084$, $\beta_1 = 4.713$, $\beta_2 = 0.145$, $\beta_3 = 0.027$, $\beta_4 < -0.001$, $\beta_5 < 0.001$, $\beta_6 < -0.001$.
Given this polynomial approximation, both VAR(g) and E(g) can be calculated by applying
Eq. 15. This relates both VAR(g) and E(g) to the $\sigma_{ti}$ of the perturbations introduced
by the experimenter. Figures 1 and 2, respectively, depict these relationships. This
characterization of the density functions for the pattern discrimination task is helpful in
understanding how the stimulus cues vary with the magnitude of the perturbations. Note
especially from Eqs. 15 and 19 that as the $\sigma$ of the perturbations increases, the mean of
the desired distribution, E(g), also increases. This makes intuitive sense. If we consider the
difference distribution $f_i = t_{1i} - t_{2i} = \Delta t_i$ (not its absolute value), then in the
same-different paradigm, where the task is to detect a change, increasing $\sigma_{ti}$ is the same
as increasing the signal magnitude. This is somewhat analogous to how increasing the rms
voltage of a noise waveform increases its sound pressure level and therefore its detectability.
We will discuss this point more fully in the Discussion section.

Figure 1. Variance of the absolute difference distribution [VAR(g)] is plotted as a function
of the values of the standard deviation of the intertone times.

Figure 2. The mean of the absolute difference distribution [E(g)] is plotted as a function of
the values of the standard deviation of the intertone times.
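
The relation in Eq. 15 can be checked numerically with a small Monte Carlo sketch. The Python code below is an illustration only: it draws untruncated Gaussian intertone times (so the 2-ms minimum gap used in the experiments is not modeled), and the mean of 50 ms, the seed, and the function name are assumptions of the sketch. For this idealized, untruncated case the standard half-normal result gives $E(g) = 2\sigma_{ti}/\sqrt{\pi}$, which is printed alongside the simulated mean purely as a cross-check and is not part of the analysis described in the text.

```python
import math
import random


def abs_difference_moments(sigma_ti, n_samples=100000, mean_ti=50.0, seed=4):
    """Monte Carlo estimates of E(g) and VAR(g) for g = |t1 - t2|."""
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(n_samples):
        g = abs(rng.gauss(mean_ti, sigma_ti) - rng.gauss(mean_ti, sigma_ti))
        total += g
        total_sq += g * g
    mean_g = total / n_samples
    var_g = total_sq / n_samples - mean_g ** 2
    return mean_g, var_g


for sigma in (20.0, 35.0, 50.0):
    mean_g, var_g = abs_difference_moments(sigma)
    eq15_rhs = 2.0 * sigma ** 2 - mean_g ** 2             # Eq. 15 using the simulated E(g)
    half_normal_mean = 2.0 * sigma / math.sqrt(math.pi)   # untruncated Gaussian case only
    print(sigma, round(mean_g, 2), round(half_normal_mean, 2),
          round(var_g, 1), round(eq15_rhs, 1))
```

The last two printed columns should agree closely for each value of the standard deviation, which is the content of Eq. 15.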

In  the  previous  section,  some  of  the  characteristics  of  the  distributions  relevant  to 
temporal  pattern  discrimination  were  examined.  One  further  property  of  these  distributions 
which  is  important  in  better  understanding  the  change  in  discriminability  of  patterns  is 
considered  here.  Specifically,  the  index  of  detectability,  d',  is  related  to  the  standard 
deviation  of  perturbations.  This  function  will  be  useful  in  the  general  discussion  section, 
where  ideal  weights  and  listener  efficiency  in  weight  distribution  are  estimated  for  some  of 
the  experiments  described. 

Simulations  were  used  to  determine  Receiver  Operating  Characteristic  (ROC)  curves. 
The noise distribution, $g_n$, was simply calculated as the absolute value of the difference
of two Gaussian random variables with zero means and a standard deviation of
15 ms; this value was selected based on estimates from previous work in this area (Sorkin,
1990). The signal-plus-noise distribution, $g_{s+n}$, was calculated as the absolute value of the
difference of two Gaussian random variables with zero mean and a standard deviation equal to
$\sqrt{15^2 + \sigma_{ti}^2}$, where $\sigma_{ti}^2$ is the variance of the perturbation introduced by the experimenter.


From each distribution, 10,000 samples were drawn and the proportion of samples from each
distribution was calculated at increasing values of a criterion C, at 2-ms intervals. Hits and
false alarms were then obtained from these proportions. This approach produced an ROC
curve for one value of $\sigma_{ti}$. The ROC curve was determined for 19 values of
$\sigma_{ti}$ = 0, 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 80, 100, 120, 140 ms. For each
curve, the area under the ROC was calculated; this area is equal to the probability of correct
responses in a two-alternative forced-choice task. From the probability estimate, d' was then
determined. Figure 3 plots the function relating d' to $\sigma_{ti}$. The star symbols are the
predictions of the simulations described above, fitted (solid line) with a power function

(20)  $d' = (0.03\, \sigma_{ti})^{0.6}$.

The power-function fit does a reasonable job in describing how predicted d' varies with
$\sigma_{ti}$ at values of most relevance to the current experiments, that is, for
$20 \le \sigma_{ti} \le 100$ ms. Clearly a better fit may have been obtained with a higher-order
polynomial; however, to keep the expression simple and because the fit is reasonable for the
range of interest, we will use this simple power function in relating d' to $\sigma_{ti}$. It should
be noted that the results from Sorkin (1990) also showed that listeners' performance (d')
improved with a positively decelerating function as the standard deviation of perturbations of
all intertone times within a pattern increased. In Sorkin's study, d' was measured as the
perturbation of intertone times was uniformly increased across all components of a sequence
of 12 tones. Later in the general discussion, we will return to the functions described in this
section to estimate ideal weights for cases where the $\sigma_{ti}$ are different for different
intertone positions.
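
The ROC simulation described in this section can be sketched as follows. The code is a minimal illustration rather than the original routine: the function names and seed are hypothetical, the area is computed with the trapezoidal rule, and d' is obtained from the area with the standard two-alternative forced-choice relation $d' = \sqrt{2}\, z(P_c)$, which is an assumption here because the text does not state which conversion was used.

```python
import math
import random
from bisect import bisect_right
from statistics import NormalDist


def abs_difference_samples(sd, n, rng):
    """Sorted samples of |x1 - x2| for two zero-mean Gaussians with the given sd."""
    return sorted(abs(rng.gauss(0.0, sd) - rng.gauss(0.0, sd)) for _ in range(n))


def roc_area(noise, signal, step=2.0):
    """Trapezoidal area under the ROC traced by sweeping criterion C upward."""
    points = [(1.0, 1.0)]                   # C below every sample
    c = 0.0
    top = max(noise[-1], signal[-1])
    while c <= top + step:
        fa = 1.0 - bisect_right(noise, c) / len(noise)       # false-alarm rate
        hit = 1.0 - bisect_right(signal, c) / len(signal)    # hit rate
        points.append((fa, hit))
        c += step
    points.sort()
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def predicted_dprime(sigma_ti, n=10000, noise_sd=15.0, seed=5):
    """Simulated d' for one perturbation sd (ms), using d' = sqrt(2) * z(area)."""
    rng = random.Random(seed)
    noise = abs_difference_samples(noise_sd, n, rng)
    signal = abs_difference_samples(math.sqrt(noise_sd ** 2 + sigma_ti ** 2), n, rng)
    return math.sqrt(2.0) * NormalDist().inv_cdf(roc_area(noise, signal))


for sigma in (20.0, 50.0, 100.0):
    power_fit = (0.03 * sigma) ** 0.6       # Eq. 20
    print(sigma, round(predicted_dprime(sigma), 2), round(power_fit, 2))
```

Under these assumptions the simulated values and the power function of Eq. 20 should be of similar size over the 20 to 100 ms range, with the larger discrepancies expected at the low end.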


Figure 3. d' is plotted as a function of the standard deviation of the intertone time (ms).
The star symbols represent the predictions from ROC analysis. The solid line
represents a power-function fit to the predictions (Eq. 20).


EXPERIMENT  I 


The  goal  of  this  experiment  was  to  determine  how  listeners  temporally  distribute  their 
attention  when  discriminating  between  two  temporal  patterns.  The  importance  of  each 
temporal  position  within  the  pattern  was  determined  by  estimating  the  decision  weight  given 
to  that  position.  This  experiment  also  examined  the  effect  of  similarity  between  two  sets  of 
temporal  envelopes  within  a sequence,  that  is,  the  effect  of  rhythmicity  on  the  discrimination 
of  the  temporal  patterns. 

For each temporal position, we calculated the correlation (on DIFFERENT trials)
between the absolute difference of the intertone times at that temporal position in the two
sequences, $|t_{1i} - t_{2i}|$, and the listener's response. The decision weight at each position
was assumed to be proportional to this correlation. All weights were normalized to sum to
unity ($\sum a_i = 1$).

Method 

Subjects 

Four  students  from  the  University  of  Florida,  one  female  and  three  males  with  normal 
hearing  (as  determined  from  self  report),  participated  in  this  experiment.  One  subject  (TL) 
had  prior  experience  with  the  task.  All  subjects  were  paid  an  hourly  wage  plus  a bonus  based 
on  performance. 




Apparatus  and  Stimuli 

Listeners  were  seated  in  a double-walled  acoustically  insulated  chamber.  The  stimuli 
were  presented  monaurally  via  TDH-39  headphones.  Listeners  had  to  complete  10  blocks 
in  each  session.  Each  block  consisted  of  100  trials.  All  independent  variables  and  conditions 
were  held  constant  within  a block.  Visual  feedback  about  the  correct  responses  was  provided 
after  each  trial. 

The intertone times between tones were generated by a process that enabled control
of the mean, standard deviation, and correlation of the intertone times (following Jeffress and
Robinson, 1962, and Licklider and Dzendolet, 1948). The intertone times were generated by
combining three independent normal random variables $t_a$, $t_b$, and $t_c$, where
$\mu_a = \mu_b = 0$, $\mu_c \neq 0$, and variances $\sigma_a^2 = \sigma_b^2 \equiv \sigma_u^2$. The two
sequences of intertone times, $(t_{1,1}, t_{1,2}, \dots, t_{1,m})$ and $(t_{2,1}, t_{2,2}, \dots, t_{2,m})$,
were generated by determining $m$ values for $t_1$ and $t_2$. These values, $t_1$ and
$t_2$, were determined from the random variables $t_a$, $t_b$, $t_c$ as follows:

(21)  $t_1 = t_a + t_c$
and

(22)  $t_2 = t_b + t_c$,

where $E(t_1) = E(t_2) = \mu_c$

and

$\mathrm{Var}(t_1) = \mathrm{Var}(t_2) = \mathrm{Var}(t_a + t_c) = \sigma_u^2 + \sigma_c^2$.


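To compute the correlation between the sequences $t_1$ and $t_2$,

(23)  $\rho_{t_1 t_2} = \mathrm{cov}(t_1, t_2) / (\sigma_{t_1} \sigma_{t_2})$,

(24)  $\mathrm{cov}(t_1, t_2) = E[(t_1 - \mu_1)(t_2 - \mu_2)] = E[(t_a + t_c)(t_b + t_c)] - \mu_c E(t_c) - \mu_c E(t_c) + \mu_c \mu_c = \sigma_c^2$

(25)  $\rho_{t_1 t_2} = \sigma_c^2 / (\sigma_u^2 + \sigma_c^2)$.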


On SAME trials of the experiment, $\rho_{t_1 t_2}$ is set to 1.0 and on DIFFERENT trials,
$\rho_{t_1 t_2}$ is set to $\rho_{ex}$, where ex denotes experimental.

The variance of the intertone times, $\sigma_{ex}^2$, is

(26)  $\sigma_{ex}^2 = \mathrm{var}(t_1) = \mathrm{var}(t_2) = \mathrm{var}(t_a + t_c) = \sigma_u^2 + \sigma_c^2$.

The relation between the sequences is given by the ratio of the variance common to the two
sequences, divided by the sum of the common and unique variances:

(27)  $\rho_{ex} = \sigma_c^2 / (\sigma_u^2 + \sigma_c^2) = \sigma_c^2 / \sigma_{ex}^2$.

This  expression  was  used  to  determine  the  correlation  between  two  sequences  in  the 
following  sections. 
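
The generation scheme of Eqs. 21 to 27 can be illustrated with a short Python sketch. It assumes the nominal values given in the text (a mean of 50 ms and a standard deviation of 35 ms), splits the total variance into common and unique parts according to Eq. 27, and does not model the 2-ms minimum gap between tones. The function name and seeds are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def generate_pair(m, rho_ex, mean=50.0, sd_ex=35.0, rng=None):
    """Two sequences of m intertone times (ms) with expected correlation rho_ex."""
    rng = rng or random.Random()
    sd_c = sd_ex * math.sqrt(rho_ex)            # common part: sigma_c^2 = rho_ex * sigma_ex^2
    sd_u = sd_ex * math.sqrt(1.0 - rho_ex)      # unique part: sigma_u^2 = (1 - rho_ex) * sigma_ex^2
    t1, t2 = [], []
    for _ in range(m):
        tc = rng.gauss(mean, sd_c)
        t1.append(rng.gauss(0.0, sd_u) + tc)    # t1 = ta + tc  (Eq. 21)
        t2.append(rng.gauss(0.0, sd_u) + tc)    # t2 = tb + tc  (Eq. 22)
    return t1, t2


# A long run to check the statistics of DIFFERENT trials with rho_ex = 0.8.
t1, t2 = generate_pair(20000, 0.8, rng=random.Random(6))
print(round(pearson(t1, t2), 3))

# A SAME trial: rho_ex = 1.0 leaves no unique variance, so the sequences coincide.
s1, s2 = generate_pair(8, 1.0, rng=random.Random(7))
print(s1 == s2)
```

With nine tones per sequence there are eight intertone times, which is why the SAME-trial example uses `m = 8`.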

Procedure 


On  a given  trial,  listeners  were  presented  with  two  sequences  of  tones,  each  composed 
of nine 1000-Hz tones presented at 71-dB sound-pressure level. The 25-ms tone bursts were
shaped by 4-ms linear rise and decay envelopes. An interval of 750 ms separated the pair
of tone sequences. After listening to the pair of sequences presented on each trial, the
subject indicated whether the temporal patterns of the tones were the same or
different. The type of trial (SAME or DIFFERENT) was selected on a random basis. On the
SAME trials, the temporal patterns were perfectly correlated (sequence pattern correlation
= 1.0). On the DIFFERENT trials, the patterns were partially correlated (sequence pattern
correlation < 1.0). Figure 4 displays a sample of both SAME and DIFFERENT trials.

There were two conditions in this experiment. In the first condition, $\rho_{ex}$ (the
correlation between the two sequences) was fixed at either 0 or 0.8. The average time
interval between the tone onsets was 50 ms with a standard deviation of 35 ms. This meant
that the time intervals varied from 15 to 85 ms within one standard deviation. For both
conditions, the minimum interval between tones (offset to onset) was 2 ms. Although this
minimum interval has some effect on about 16% of the trials, the correlation weights are
immune to the shape of the distributions. However, this makes other calculations (e.g., the
absolute value of the difference distribution) only approximate. The correlations were
fixed within each block of trials. In the second condition, each sequence was composed of
ten tones and was divided into two parts that were separated by a fixed interval of 50 ms.
In this condition, the repetition correlation controlled the correlation between the intertone
times of the two parts within the sequence. The higher the value of this correlation, the
greater the repetition between the two subsequences. For example, a repetition correlation of
0 represented no repetition between the subsequences within the sequence, whereas a
repetition correlation of 1 meant that the second subsequence was identical to the first
subsequence. Figure 5 depicts a sample of SAME and DIFFERENT trials for this condition.


[Figure 4 schematic: tone sequences 1 and 2 plotted against time for a SAME trial (top) and a DIFFERENT trial (bottom). t = intertone time; frequency = 1000 Hz; tone duration = 25 ms.]

Figure 4. The top part of this figure represents a sample of SAME trials for the two
sequences of a pattern. The bottom part represents a sample of DIFFERENT trials.
The frequency and duration of each tone were 1000 Hz and 25 ms, respectively.


[Figure 5 schematic: tone sequences 1 and 2 plotted against time for a SAME trial (top) and a DIFFERENT trial (bottom), each divided into subsequences A and B separated by a 50-ms gap. t = intertone time; frequency = 1000 Hz; tone duration = 25 ms.]

Figure 5. In this figure, each sequence is divided into 2 subsequences, A and B. A 50-ms
gap separates the two subsequences. The top part of the figure represents a sample
of SAME trials for the two sequences of a pattern. The bottom part represents a
sample of DIFFERENT trials.


Listeners were tested with repetition correlations of 0 and 0.85. For both
values, the sequence correlation was fixed at 0. Listeners ran several hours of practice
trials before data collection began. No practice effect was evident in discrimination
performance.

Results 


Figure 6 presents the individual data obtained for each listener for the first condition.
The abscissa represents the temporal position of each intertone time and the ordinate shows
the relative weight given to each position. When the correlation between the two sequences
was set to 0, all listeners generally gave higher weights to the first
position than to most other positions. For two listeners, TL and AR, the terminal
positions were also heavily weighted. Figure 7 presents the average data from all 4 listeners
for this case. The error bars represent one standard error of the mean. An ANOVA
showed a significant effect of temporal position [F(3,7) = 5.27, p < 0.05]. When the sequence
correlation was set to 0.8, there was generally little variation among the listeners'
weightings of the positions, although the first position still seemed to carry a somewhat
larger weight. Figures 8 and 9, respectively, represent the individual and average data for
the 0.8 condition. An ANOVA [F(3,7) = 2.21, p > 0.05] on the data of Figure 9 did not
show a significant effect of position, although the tendency toward a higher weight for the first
position is still somewhat evident.
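One common way to obtain relative weights of this kind is to correlate, across trials, the size of the timing difference at each temporal position with the listener's response, and then to normalize the correlations so that they sum to one. The Python sketch below illustrates that idea with a simulated observer; it is not necessarily the exact analysis used in this study. The weighted-sum observer, its true weights, the criterion, the internal-noise value, and all function names are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_trials(true_weights, n_trials, sd, internal_sd, rng):
    """Simulate DIFFERENT trials for a hypothetical weighted-sum observer."""
    diff_sd = math.sqrt(2.0) * sd  # SD of t1 - t2 for independent sequences
    # Assumed criterion: the expected value of the weighted sum of |t1 - t2|.
    criterion = diff_sd * math.sqrt(2.0 / math.pi) * sum(true_weights)
    diffs, responses = [], []
    for _ in range(n_trials):
        d = [abs(rng.gauss(0.0, diff_sd)) for _ in true_weights]
        decision = sum(w * x for w, x in zip(true_weights, d))
        decision += rng.gauss(0.0, internal_sd)  # internal noise (assumed)
        diffs.append(d)
        responses.append(1.0 if decision > criterion else 0.0)
    return diffs, responses


def relative_weights(diffs, responses):
    """Correlate each position's difference with the response; normalize to sum to 1."""
    n_pos = len(diffs[0])
    r = [pearson([trial[i] for trial in diffs], responses) for i in range(n_pos)]
    total = sum(r)
    return [v / total for v in r]


rng = random.Random(7)
true_w = [0.30] + [0.10] * 7  # eight intertone positions, first weighted most
diffs, responses = simulate_trials(true_w, 20000, 35.0, 5.0, rng)
est_w = relative_weights(diffs, responses)
print([round(w, 2) for w in est_w])
```

In this simulation the estimated weight for the first position comes out largest, mirroring the qualitative pattern described for the uncorrelated-sequence data.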


[Plots for Figures 6-9: ordinate, relative weight; abscissa, temporal position.]

Figure 6. Relative weights are plotted as a function of each temporal position. The sequence
correlation is equal to 0. Each panel represents data obtained from one listener.

Figure 7. Relative weights, averaged from 4 listeners, are plotted as a function of each
temporal position. The sequence correlation is equal to 0. The error bars represent
one standard error of the mean.

Figure 8. Relative weights are plotted as a function of each temporal position. The sequence
correlation is equal to 0.8. Each panel represents data obtained from one listener.

Figure 9. Relative weights, averaged from 4 listeners, are plotted as a function of each
temporal position. The sequence correlation is equal to 0.8. The error bars represent
one standard error of the mean.


The data of the first condition of this experiment show that listeners indeed allocate more
attention to some temporal positions than to others. These results indicate that the listener's
attention is substantially directed at the first-occurring segment and, to a lesser degree, at
the last position, particularly when the sequences were uncorrelated and to a smaller extent
when they were partially correlated. These results have a bearing on such models as the
temporal pattern correlation model proposed by Sorkin (1990). This model essentially
suggests that a listener's response, when discriminating temporal patterns, is based on
estimates of the correlations between the two sets of intertone times. However, it is
reasonable to assume that listeners would not be capable of efficiently storing all the
intertone times in memory. The present results provide evidence suggesting that some
positions or intertone times may play a more important role than others in the listener's
estimate of correlation.

In the second part of this experiment, where the sequences were divided into two
subsequences and both the repetition correlation and the sequence correlation were set to
0, it was again the first position that generally was given the highest weight. It should be
noted that the 50-ms gap between the two subsequences was the only difference between this
case and the first condition. Figures 10 and 11, respectively, show the data of this condition
for the individual listeners and the average data from these 4 listeners. The obtained d' values
in this condition were high. As a result, there was very little variability between perturbations
of intertone times and response. We used partial correlations to estimate the latter weights,
due to the existing correlation (repetition correlation of 0.85) between corresponding
intertone times in the two subsequences within each sequence. Because of the high d' values,
these partial correlations were noisy and difficult to measure. Since the weights derived
from these partial correlations were unreliable, d' was used to depict listeners' performance
in this condition. Table 1 presents the individual and average d' values for the second case of
this condition. The results did not show an improvement in the listeners' performance due to
repetition of some of the intertone times. This set of results was not in agreement with
previous findings (e.g., Martin, 1972), which suggest that discrimination of temporal patterns
can be improved greatly by repeating the patterns in a way that gives an impression of
rhythmicity among the components. In fairness, it should be noted that the high d' values
observed may have made any potentially helpful effects of rhythmicity difficult to
measure.

Figure 10. Relative weights are plotted as a function of each temporal position. Both the
sequence correlation and the repetition correlation are equal to 0. Each panel
represents data obtained from one listener.

Figure 11. Relative weights, averaged from 4 listeners, are plotted as a function of each
temporal position. Both the sequence correlation and the repetition correlation are
equal to 0. The error bars represent one standard error of the mean.

The two conditions of this experiment provide evidence on how listeners distribute
their attention in a temporal pattern recognition task. The first part demonstrated that the
distribution of attention is not equal among all the presented components. These results were
not in agreement with the study by Watson et al. (1975), which suggested that the latest-occurring
information was the most important temporal location of the sequence. Note, however, that
Watson et al.'s patterns were defined by changes in both frequency and duration of some
tones with fixed intertone times, whereas the patterns described here are defined by intertone
times using fixed-frequency tones. Our results showed that the earlier-occurring information
is the most important locus of information. However, some listeners also tended
to give a large weight to intertone times near the end of the sequence. This occasional
increase in weight at the end of the sequence for some subjects may be related to short-term
memory effects.


Table 1.

d' for each repetition correlation for each listener (sequence correlation = 0).

Subject    Repetition correlation = 0    Repetition correlation = 0.85
AF         2.55                          2.60
AR         2.89                          2.88
DD         3.45                          3.69
TL         3.46                          3.50
Average    3.09                          3.17


EXPERIMENT  II 


The results of the previous experiment demonstrated that listeners do not distribute
their attention uniformly among intertone times of equal duration, but rather give more
weight to certain positions. In that case, the intertone times were all statistically equivalent.
The following experiment was designed to investigate whether a statistical change in one of the
intertone times would result in a change in the listener's weighting strategy. Two questions
were addressed. First, does the occurrence of one long-duration component "command"
attention away from the other components? Kidd and Watson (1992) have recently proposed
the "proportion-of-total-pattern-duration (PTD) rule" to describe how listeners distribute
attention in discrimination of tonal patterns. This rule essentially suggests that each
individual element of an unknown sequence of tones is detected with an accuracy that is
related to its proportion of duration compared to the total duration of the sequence. By
analyzing the listener's weighting strategy, the current experiment examines the effect of
making one component more perceptually salient than the others. In addition, it can test the
predictions of the PTD rule. In the first part of this experiment, one position (in each of the
patterns to be compared) was assigned an intertone duration either greater or less than the mean
intertone duration of all other positions. According to the PTD rule, the decision weight
given by the listener to each position should correspond to its proportion of the total pattern
duration; that is, each position should receive a weight proportional to its relative duration.
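The PTD prediction can be made concrete with a small calculation. The Python sketch below computes the weight each position would receive if weights were exactly proportional to mean intertone duration. It uses the 60-ms base mean and the 20 to 100 ms range of altered means reported in the Method of this experiment; the use of eight positions (nine tones bound eight intertone times) and the function names are assumptions made for illustration.

```python
def ptd_weights(mean_durations):
    """Weights proportional to each position's share of the total duration."""
    total = sum(mean_durations)
    return [d / total for d in mean_durations]


def pattern_means(n_positions, base_mean, odd_position, odd_mean):
    """Mean intertone times with one position (1-indexed) given a different mean."""
    means = [base_mean] * n_positions
    means[odd_position - 1] = odd_mean
    return means


# Predicted weight of the 2nd position as its mean duration changes.
for odd_mean in (20, 40, 60, 80, 100):
    weights = ptd_weights(pattern_means(8, 60, 2, odd_mean))
    print(odd_mean, round(weights[1], 3))
```

Under this strict reading of the rule, the predicted weight of the altered position rises steadily with its mean duration (from about 0.05 at 20 ms to about 0.19 at 100 ms), which is the trend the decision weights would need to show for the rule to hold.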



The second part of this experiment investigated whether listeners could differentially allocate
attention according to the diagnosticity or magnitude of the information. In this part of the
experiment, the variance of one of the intertone times was chosen to be different from that of
the others. Recall from Figure 2 that an increase in the variance of an intertone time increases
the mean of the distribution g, and therefore increases the detectability of that intertone time.
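The dependence of this mean on the variance can be sketched numerically. The Python example below assumes that g denotes the distribution of the absolute difference between corresponding intertone times on DIFFERENT trials, and that the two intertone times are independent and normally distributed with a common standard deviation; neither assumption is restated in this section, so the sketch should be read as illustrative only. Under those assumptions the expected absolute difference is 2 * sd / sqrt(pi).

```python
import math


def mean_abs_difference(sd):
    """E|t1 - t2| for independent normal t1 and t2 with common SD `sd`."""
    # t1 - t2 is normal with SD sqrt(2) * sd; a half-normal with SD s has
    # mean s * sqrt(2 / pi), which simplifies to 2 * sd / sqrt(pi).
    return 2.0 * sd / math.sqrt(math.pi)


for sd in (20, 40, 60, 100):
    print(sd, round(mean_abs_difference(sd), 1))
```

The expected difference grows in direct proportion to the standard deviation, so a position with a larger variance produces, on average, a larger and more detectable difference between the two sequences.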


Method 

Subjects 

Four University of Florida students (two males and two females) with normal hearing
participated in both studies. Two subjects had participated in the first study and had prior
experience with the task. All subjects were paid an hourly wage plus a bonus based on
performance.

Apparatus  and  Stimuli 

The same apparatus and stimuli as in the first experiment were used in this experiment.

Procedure

On a given trial, listeners were presented with two sequences of tones, each composed
of nine 1000-Hz tones presented at 71-dB sound-pressure level. The tone bursts were 25 ms
in duration and were shaped by 4-ms linear rise and decay envelopes. An interval of 750 ms
separated the pair of tone sequences. After listening to the pair of sequences presented on
each trial, the subject indicated whether the temporal patterns of the tones were the
same or different. On a random half of the trials, the temporal patterns were perfectly
correlated (SAME trials). On the other half of the trials (DIFFERENT trials), the patterns were not
correlated (the sequence correlation \rho_{ex} was fixed at 0).

In the control condition of this experiment, the mean and standard deviation for all
the components within the sequences were 60 and 20 ms, respectively. Figure 12 represents
a sample of SAME and DIFFERENT trials for this control condition. This control condition is
the same as the first part of Experiment I; however, since different means and standard
deviations as well as different subjects were used in this experiment, we repeated this
condition in Experiment II.

For the first case, one of the intertone times had a different (either higher or lower)
mean duration, either at an early position (2nd) or at a later position (6th) within the
sequence. The different mean intertone-time values ranged from 20 to 100 ms, in steps of
20 ms; during a block of trials, this mean value was fixed. These values were
chosen because they are comparable to some of the values observed in voice-onset-time
(VOT) discrimination. For example, it has been shown that the discrimination between /ba/
and /pa/ requires attention to the duration of events occurring within the first 60-100 ms of
a syllable that is itself 300-400 ms in duration. It should be noted that the mean and standard
deviation for all the other positions within the sequences were kept at 60 and 20 ms,
respectively. While the mean values varied, the standard deviation of all intertone times was
constant. Figure 13 represents the summary of the trial types for this case.


[Figure 12 schematic: tone sequences 1 and 2 plotted against time for a SAME trial (top) and a DIFFERENT trial (bottom). t = intertone time; frequency = 1000 Hz; tone duration = 25 ms; mean (all positions) = 60 ms; std. dev. (all positions) = 20 ms.]

Figure 12. The upper part of this figure represents a sample of the SAME trials, while the
lower part represents the DIFFERENT trials for the control condition of Experiment
2.

[Figure 13 schematic: tone sequences 1 and 2 plotted against time for a SAME trial (top) and a DIFFERENT trial (bottom). t = intertone time; frequency = 1000 Hz; tone duration = 25 ms; mean (positions 2 or 6) = 20, 40, 80, 100 ms; mean (all positions) = 60 ms; std. dev. (all positions) = 20 ms.]

Figure 13. This figure represents a summary of the stimuli used in the first case of
Experiment 2. In this case, the second or sixth intertone time has a different mean
duration than all others. The standard deviation is the same for all positions.



For the second case, one of the intertone times had a higher standard-deviation value,
which occurred, again, either at the 2nd position or at the 6th position within the sequence.
The mean and standard deviation for all other positions were set to 60 and 20 ms,
respectively, except when the standard-deviation value was 100 ms. In that case, the mean
for all the positions was set to 100 ms. The different standard-deviation values were 40, 60,
and 100 ms. The results from this experiment should indicate whether listeners are able to change
their weighting strategy based on the reliability or magnitude of the given information. Figure
14 represents the summary of stimulus configuration and parameters for this case.

For both cases of this study, the minimum interval between tones (offset to onset) was
2 ms and the sequence correlation was fixed within each block of trials. Listeners ran
several hours of practice trials before the data collection began, and no practice effects were
subsequently observed.

Results 


Figures 15 and 16 depict the individual and average data, respectively, from the four
listeners in the control condition. For all figures, the abscissa represents the temporal
position of each intertone time, while the ordinate represents the relative weight given to
each temporal position. The first position, on average, is given the highest weight by these
listeners. Note the similarity between these data and those of Figure 7. Data from the 4
listeners for case 1 of this experiment (where one of the intertone times had a different
mean) are plotted in Figures 17 A-D and 18 A-D. Figures 17 A-D represent the individual
data for the 2nd position, for all the tested values, where the standard deviations were all
equal and fixed while the mean intertone time of the second position varied: 20, 40, 80, and 100
ms for Figures 17 A-D, respectively. Figures 18 A-D represent the average data obtained
for the different means at the second position. Figures 19 A-D and 20 A-D represent the
individual and average data for the case where the 6th position was examined; again, the
mean intertone times were all 60 ms except for position 6. Figures 21 and 22 represent the
average data from these listeners compared to the control condition. An ANOVA
showed no significant effect of a different mean intertone time at the second position
[F(3,4) = 2.24, p > 0.05] or the sixth position [F(3,4) = 1.00, p > 0.05], although the individual
data in some cases do show a small tendency for a larger than usual weight given to the
position with a different mean.

[Figure 14 schematic: tone sequences 1 and 2 plotted against time for a SAME trial (top) and a DIFFERENT trial (bottom). t = intertone time; frequency = 1000 Hz; tone duration = 25 ms; std. dev. (positions 2 or 6) = 40, 80, 100 ms; mean (all positions) = 60 ms; std. dev. (all positions) = 20 ms.]

Figure 14. This figure represents a summary of the stimuli used in the second case of
Experiment 2. In this case, the second or sixth intertone time has a different standard
deviation than all others. The mean duration is the same for all positions.

[Plots for Figures 15-18: ordinate, relative weight; abscissa, temporal position.]

Figure 15. Control condition for Experiment 2. Relative weights are plotted as a function
of each temporal position. Each panel represents the obtained data for each
respective listener. The mean and standard deviation, for each position, were 60 and
20 ms, respectively.

Figure 16. The average relative weights, from 4 listeners, are plotted as a function of each
temporal position. The error bars represent one standard error of the mean. The
mean and standard deviation, for each position, were 60 and 20 ms, respectively.

Figure 17 A. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the second
position was 20 ms.

Figure 17 B. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the second
position was 40 ms.

Figure 17 C. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the second
position was 80 ms.

Figure 17 D. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the second
position was 100 ms.

Figure 18 A. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the second position was 20 ms.

Figure 18 B. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the second position was 40 ms.

Figure 18 C. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the second position was 80 ms.

Figure 18 D. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the second position was 100 ms.

There are two clear findings on the effects of mean intertone time on the discrimination
of auditory temporal patterns. First, listeners' weights are generally highest for the first
temporal position, regardless of the mean intertone time (20-100 ms) at either the 2nd or the 6th
temporal position. Second, the listeners' weights seem independent of the proportion of the
total pattern duration that a temporal position occupies within the sequence. This can be
observed when the decision weight is plotted as a function of the different mean intertone
times (Figures 21 and 22). These results show that a segment with a different duration will
not capture the listener's attention in the way suggested by the PTD rule.

For case 2, all listeners consistently gave higher weights to the temporal position that
had the higher standard deviation, regardless of where the change occurred within the
sequence (2nd or 6th position). This was true of all the tested standard-deviation values.


[Plots for Figures 19-22: ordinate, relative weight; abscissa, temporal position (Figures 19 and 20).]

Figure 19 A. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the sixth position
was 20 ms.

Figure 19 B. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the sixth position
was 40 ms.

Figure 19 C. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the sixth position
was 80 ms.

Figure 19 D. Relative weights are plotted as a function of each temporal position. Each
panel represents the data from one listener. The mean duration for the sixth position
was 100 ms.

Figure 20 A. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the sixth position was 20 ms.

Figure 20 B. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the sixth position was 40 ms.

Figure 20 C. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the sixth position was 80 ms.

Figure 20 D. The average relative weights, from all listeners, are plotted as a function of
each temporal position. The error bars represent one standard error of the mean. The
mean duration for the sixth position was 100 ms.

Figure 21. This figure depicts average relative weights as a function of the different mean
values for the second temporal position.

Figure 22. This figure depicts average relative weights as a function of the different mean
values for the sixth temporal position.


Figures 23 A-C and 24 A-C represent, respectively, the individual and average results for the
tested values at the 2nd position. Figures 25 A-C and 26 A-C represent the individual
and average results for the 6th position. Figures 27 and 28 represent the summary data for
this part of the experiment. These figures depict relative weights as a function of the
standard-deviation values compared to the control condition. They demonstrate that the
higher the value of the standard deviation, the larger the weight that was given to that
position by the listeners. An ANOVA showed a significant effect of changing the standard
deviation of intertone times at both positions: 2 [F(3,3) = 15.7, p < 0.05] and 6 [F(3,3) = 16.75,
p < 0.01].


RELATIVE  WEIGHT  RELATIVE  WEIGHT 


74 


Figure  23  A.  Relative  weights  are  plotted  as  a function  of  each  temporal  position.  Each 
panel  represents  the  data  from  one  listener.  The  standard  deviation  for  the  second 
position  was  40  ms. 


RELATIVE  WEIGHT  RELATIVE  WEIGHT 


75 


Figure  23  B.  Relative  weights  are  plotted  as  a function  of  each  temporal  position.  Each 
panel  represents  the  data  from  one  listener.  The  standard  deviation  for  the  second 
position  was  60  ms. 


RELATIVE  WEIGHT  RELATIVE  WEIGHT 


76 


Figure  23  C.  Relative  weights  are  plotted  as  a function  of  each  temporal  position.  Each 
panel  represents  the  data  from  one  listener.  The  standard  deviation  for  the  second 
position  was  100  ms. 


RELATIVE  WEIGHT 


77 


TEMPORAL  POSITION 


Figure  24  A.  The  average  relative  weights,  from  all  listeners,  are  plotted  as  a function  of 
each  temporal  position.  The  error  bars  represent  one  standard  error  of  the  mean.  The 
standard  deviation  for  the  second  position  was  40  ms. 


RELATIVE  WEIGHT 


78 


TEMPORAL  POSITION 


Figure  24  B.  The  average  relative  weights,  from  all  listeners,  are  plotted  as  a function  of 
each  temporal  position.  The  error  bars  represent  one  standard  error  of  the  mean.  The 
standard  deviation  for  the  second  position  was  60  ms. 


RELATIVE  WEIGHT 


79 


TEMPORAL  POSITION 


Figure  24  C.  The  average  relative  weights,  from  all  listeners,  are  plotted  as  a function  of 
each  temporal  position.  The  error  bars  represent  one  standard  error  of  the  mean.  The 
standard  deviation  for  the  second  position  was  100  ms. 


RELATIVE  WEIGHT  RELATIVE  WEIGHT 


80 


TEMPORAL  POSITION 


Figure  25  A.  Relative  weights  are  plotted  as  a function  of  each  temporal  position.  Each 
panel  represents  the  data  from  one  listener.  The  standard  deviation  for  the  sixth 
position  was  40  ms. 


RELATIVE  WEIGHT  RELATIVE  WEIGHT 


81 


Figure  25  B.  Relative  weights  are  plotted  as  a function  of  each  temporal  position.  Each 
panel  represents  the  data  from  one  listener.  The  standard  deviation  for  the  sixth 
position  was  60  ms. 


RELATIVE  WEIGHT  RELATIVE  WEIGHT 


82 


Figure  25  C.  Relative  weights  are  plotted  as  a function  of  each  temporal  position.  Each 
panel  represents  the  data  from  one  listener.  The  standard  deviation  for  the  sixth 
position  was  100  ms. 


RELATIVE  WEIGHT 


83 


0.50 


0.40 


0.30 


0.20 


0.10 


TEMPORAL  POSITION 


Figure  26  A.  The  average  relative  weights,  from  all  listeners,  are  plotted  as  a function  of 
each  temporal  position.  The  error  bars  represent  one  standard  error  of  the  mean.  The 
standard  deviation  for  the  sixth  position  was  40  ms. 


RELATIVE  WEIGHT 


84 


TEMPORAL  POSITION 


Figure  26  B.  The  average  relative  weights,  from  all  listeners,  are  plotted  as  a function  of 
each  temporal  position.  The  error  bars  represent  one  standard  error  of  the  mean.  The 
standard  deviation  for  the  sixth  position  was  60  ms. 


RELATIVE  WEIGHT 


85 


Figure  26  C.  The  average  relative  weights,  from  all  listeners,  are  plotted  as  a function  of 
each  temporal  position.  The  error  bars  represent  one  standard  error  of  the  mean.  The 
standard  deviation  for  the  sixth  position  was  100  ms. 


[Figure 27 axes. Ordinate: relative weight (0.00 to 0.50). Abscissa: standard deviation of the
intertone time of the 2nd position (ms), 0 to 120, with the control condition marked.]


Figure  27.  This  figure  depicts  average  relative  weights  as  a function  of  the  different  standard 
deviation  values  for  the  second  temporal  position. 


[Figure 28 axes. Ordinate: relative weight (0.00 to 0.50). Abscissa: standard deviation of the
intertone time of the 6th position (ms), 0 to 120, with the control condition marked.]


Figure  28.  This  figure  depicts  average  relative  weights  as  a function  of  the  different  standard 
deviation  values  for  the  sixth  temporal  position. 


EXPERIMENT  III 


The  goal  of  this  experiment  was  to  examine  the  effect  of  presenting  the  individual 
tones  of  each  sequence  to  different  ears.  This  experiment,  specifically,  investigated  whether 
a temporal  sequence  can  be  encoded  independent  of  which  ear  had  received  each  part  of  the 
information.  It  should  be  noted  that  depending  on  the  particular  binaural  configuration  of  the 
stimulus,  presenting  the  auditory  information  through  different  channels  may  also  affect  the 
listener's  attentional  strategies  as  described  below. 

Method 

Subjects  and  Apparatus 

The same four subjects who had participated in the second experiment were employed
in this experiment. The apparatus was also the same.

Procedure 

On a given trial, listeners were presented with two sequences of tones. Each sequence
was composed of nine 1000-Hz tones presented at 71-dB sound-pressure level. The 25-ms
tone bursts were shaped by 4-ms linear rise and decay envelopes. An interval of 750 ms
separated the pair of tone sequences. Four conditions were examined in this experiment:
random-mode binaural, monaural, alternating binaural, and alternating monaural.




In  the  random-mode  binaural  condition,  each  individual  tone  of  each  sequence  was 
presented randomly to either the right or the left ear. After listening to the pair of sequences
presented on each trial, the subjects indicated whether the temporal patterns of the
tones were the SAME or DIFFERENT. Half of the trials, on a random basis, were SAME
trials  and  the  other  half  were  DIFFERENT  trials.  The  sequence  pattern  correlation  was  set 
to  0.  Figure  29  represents  a sample  of  both  same  and  different  trials  for  this  condition. 

In  the  alternating  binaural  condition,  the  tones  of  each  sequence  were  presented 
alternately  to  each  ear.  For  instance,  tones  1,  3,  5,  7,  9 were  presented  to  the  left  ear,  while 
tones 2, 4, 6, 8 were presented to the right ear. This meant that listeners would be able to
anticipate the channel to which each tone would be presented, which might facilitate the
detection of intertone times. All other aspects of this condition were identical to the
random-mode binaural condition.

In  the  monaural  condition,  the  tones  of  both  sequences  were  presented  to  only  one  ear. 
All  other  aspects  of  this  condition  were  identical  to  the  random-mode  binaural  condition. 
For the alternating monaural condition, the stimuli were taken from the alternating binaural
case, but listeners used only the left earphone (the right earphone was deactivated). This
meant that they were given only half the sequence information. However, the half in the left
ear was temporally consistent in both sequences of the SAME trials, albeit with fewer tones
and longer intertone times. The purpose of this latter condition will soon become evident;
for now, it should be said that it served as a control test for the alternating binaural
condition. For all conditions of this study, the mean and standard deviation of the intertone
times were set to 60 and 20 ms, respectively.
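
The four conditions described above can be summarized as a small stimulus generator. The following is only a minimal sketch and not the software used in the experiment. The function names (`draw_intertone_times`, `assign_ears`, `make_trial`) are hypothetical. Two assumptions are made: that the second sequence on DIFFERENT trials is an independent draw (consistent with the stated pattern correlation of 0), and that negative draws are clipped at 0 ms.

```python
import random

N_TONES = 9        # tones per sequence
MEAN_ITT = 60.0    # mean intertone time (ms)
SD_ITT = 20.0      # standard deviation of intertone time (ms)


def draw_intertone_times(rng):
    """Draw the eight intertone times (ms) of one nine-tone sequence."""
    # Assumption: Gaussian draws, clipped at 0 ms to avoid negative gaps.
    return [max(0.0, rng.gauss(MEAN_ITT, SD_ITT)) for _ in range(N_TONES - 1)]


def assign_ears(condition, rng):
    """Return the ear ('L' or 'R') for each tone; None marks a tone that is not heard."""
    if condition == "random_binaural":
        return [rng.choice("LR") for _ in range(N_TONES)]
    if condition == "monaural":
        return ["L"] * N_TONES
    if condition == "alternating_binaural":
        # Tones 1, 3, 5, 7, 9 to the left ear; tones 2, 4, 6, 8 to the right ear.
        return ["L" if i % 2 == 0 else "R" for i in range(N_TONES)]
    if condition == "alternating_monaural":
        # Same stimulus as alternating binaural, with the right earphone deactivated.
        return ["L" if i % 2 == 0 else None for i in range(N_TONES)]
    raise ValueError(f"unknown condition: {condition}")


def make_trial(condition, rng):
    """Build one SAME or DIFFERENT trial (equal probability) for a condition."""
    same = rng.random() < 0.5
    first = draw_intertone_times(rng)
    # Assumption: DIFFERENT trials use an independently drawn second sequence.
    second = list(first) if same else draw_intertone_times(rng)
    ears = (assign_ears(condition, rng), assign_ears(condition, rng))
    return {"same": same, "intertone_times": (first, second), "ears": ears}
```

A trial for a given condition would be produced with, for example, `make_trial("random_binaural", random.Random(0))`.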


[Figure 29 diagram. Two panels, SAME (top) and DIFFERENT (bottom). Each panel shows Sequence 1
and Sequence 2 as tone bursts on separate right-ear and left-ear tracks. t = intertone time;
frequency = 1000 Hz; tone duration = 25 ms.]



Figure  29.  The  top  portion  represents  a sample  of  SAME  trials  for  the  two  sequences  of  a 
pattern  in  the  random-mode  binaural  condition.  The  bottom  portion  represents  a 
sample of DIFFERENT trials in this condition. The frequency and duration of each
tone were 1000 Hz and 25 ms, respectively.




Results 

Figure 30 presents the individual data for all the conditions of this experiment. The
abscissa represents the different conditions and the ordinate represents the percent-correct
discrimination. Figure 31 presents the average performance of all 4 listeners. The
average percent-correct values for the random-mode binaural, monaural, alternating binaural,
and alternating monaural conditions were 55.2, 86.2, 79.2, and 81.7, respectively. The error
bars show one standard error of the mean. An ANOVA performed on the average
percent correct showed a significant main effect of condition [F(3,3) = 53.45, p < 0.01]. It
should be noted that performance in the random-mode binaural condition was near chance. At
this low level of performance, it is difficult to measure correlation weights and, consequently,
it was decided to use only a percent-correct measure of performance.

In the alternating binaural condition, performance was substantially better than in the
random-mode binaural condition, and nearly as good as in the monaural condition. There are
two possible explanations for this. First, performance in the strict alternating binaural case
may have been better because the listeners could anticipate the ear to which each tone would
be presented. Watson et al. (1986) have shown that a reduction in uncertainty facilitates the
discrimination of auditory patterns. Second, listeners in the alternating binaural condition
may simply have focused on the pattern of information in one ear and ignored what
occurred in the other ear. This attentional strategy essentially reduces the task to the
discrimination of a monaural sequence. Because the tones are alternating and not random, the
perceived monaural sequence is consistent in both intervals on SAME trials. An ideal
detector that makes use of only one channel would perform perfectly if there were no other
limitations. To test these ideas, the alternating binaural stimuli were used and one earpiece
(right) was deactivated. Because the resultant monaural pattern has half as many
intervals, which are, on average, twice as long, it was not clear a priori how a human
observer would perform. Results were substantially better than chance and nearly identical
to those of the strict alternating binaural condition (Figure 31). Though a contribution from a
reduction in uncertainty is not discounted here, much of the improvement in the alternating
binaural condition may be accounted for by monaural pattern discrimination.

Figure 30. The abscissa represents the experimental conditions and the ordinate represents
the percent correct. Each panel represents the data from one listener.

Figure 31. Average percent correct is plotted as a function of the experimental conditions.
The error bars represent one standard error of the mean.


GENERAL  DISCUSSION 


The results from the first part of Experiment I indicate that the temporal location of
intertone-time information is important in pattern recognition and that listeners do not
distribute attention equally among intertone times. This was evident from the larger
weights obtained for the first temporal location. This result is not in agreement with the
previous findings of Watson et al. (1975), who suggested that the latest-occurring positions
are the most important in the discrimination of temporal sequences.

However,  there  are  important  differences  between  this  experiment  and  the  Watson  et 
al.  study  (1975)  that  may  partially  account  for  the  divergent  results.  First,  in  the  Watson  et 
al.  experiments,  the  sequences  consisted  of  ten  tones  of  40-ms  duration  that  ranged  from 
256-1500  Hz.  The  listener's  task  was  to  detect  changes  in  the  frequency  of  only  one  tonal 
component  in  the  pattern  using  a same-different  task.  The  frequency  of  these  tones  varied 
randomly  within  a sequence  and  the  sequences  were  presented  without  any  intervals  between 
them. However, in our experiment, while the frequency of the tones was kept constant, the
time intervals between the tones of both sets of sequences were varied. Second, although their
results indicated that a change in frequency of one of the tonal components was most easily
detected when that change occurred at the end of the pattern, it is interesting to note that
when a silent interval was inserted following an early-occurring component in the sequence,
performance became similar to that observed when the change had occurred towards the end
of the sequence.

The second part of Experiment I showed that repeating the auditory patterns had little
influence on the listeners' performance. One explanation for this result may be that the
extent of repetition that was selected (a repetition correlation of 0.85) might not have been
sufficient for the subjective grouping of repeating elements that results in an
impression of rhythmicity of the information for the listeners. Also, it has been shown that
there  are  factors,  other  than  repetition  among  components,  that  are  necessary  for  subjective 
rhythmic  impression.  For  instance,  Robin  et  al.  (1987)  suggested  that  the  rate  of  presentation 
plays  an  important  role  for  the  perception  of  rhythmic  organization.  They  proposed  that  the 
faster  the  rate  of  presentation  of  the  elements,  the  more  the  elements  are  perceived  as  being 
rhythmic.  This  was  also  observed  by  Monahan  and  Hirsh  (1990). 

Results from case 1 of Experiment II suggest that when one of the temporal positions
in each pattern is made different (on average) from the others, listeners generally do not
assign more weight to that position. The first temporal position still receives the largest
weight. The influence of the first position in the discrimination of the temporal pattern seems
to be robust. The fact that a different mean duration for one of the temporal positions did
not affect the listener's decision strategy should not be surprising, since making the duration
of one position different from the others does not provide the listener with any more
diagnostic information about the discriminability of the two patterns. Note from the density
functions described earlier (pages 18-25) that changing the mean duration of components does
not affect the characteristics of the difference distribution or its absolute value (see Eq. 15).
The mean of the difference distribution is zero, independent of the mean duration of intertone
times. The only way in which a change in mean intertone time might have affected the
discrimination of the position is through Weber's law: a larger average intertone time may
have required a larger perturbation for the same level of performance. However, within the
bounds of the durations tested here, this was not the case, and changes in the mean duration
of intertone times had very little effect. These observations may also suggest why the
predictions of the PTD rule (Kidd & Watson, 1992) might be limited and not applicable to a
temporal pattern-discrimination task such as that employed in the current experiments.

Case  2 of  Experiment  II  demonstrated  that  when  one  of  the  temporal  positions  had 
a more  variable  intertone  time  than  the  others,  listeners  did  allocate  substantially  more 
attention  to  that  position.  This  finding  should  not  be  unexpected,  since  altering  the  variance 
of  the  intertone  time  provides  the  listener  with  additional  diagnostic  information  about  the 
pattern. Recall from Figure 2 that changing the variance of a component results in changing
the  mean  of  the  absolute  value  of  the  difference  distribution  (distribution  g).  That  is  why  the 
information  may  be  considered  as  more  diagnostic  when  the  variance  of  one  of  the  intertone 
times  is  increased.  This  diagnosticity  of  information  affects  the  attentional  strategy 
employed  by  listeners,  which,  in  turn,  may  result  in  an  improvement  in  performance  and 
weight  given  to  that  component.  This  improvement  is  also  confirmed  by  the  higher  average 
d'  values  obtained  for  this  case  of  the  experiment. 
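
The contrast between case 1 and case 2 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the analysis used in this study: it assumes that the two intertone times being compared are independent Gaussian draws with the same mean and standard deviation, and the function names are hypothetical. Under that assumption the difference is Gaussian with zero mean and variance 2σ², so the expected absolute difference is 2σ/√π, which depends on the standard deviation but not on the mean.

```python
import math
import random


def mean_abs_difference(mean_ms, sd_ms, n=20000, seed=0):
    """Monte Carlo estimate of E|t1 - t2| for independent Gaussian intertone times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += abs(rng.gauss(mean_ms, sd_ms) - rng.gauss(mean_ms, sd_ms))
    return total / n


def expected_abs_difference(sd_ms):
    """Closed form: t1 - t2 ~ N(0, 2*sd^2), and E|X| = s*sqrt(2/pi) for X ~ N(0, s^2)."""
    return 2.0 * sd_ms / math.sqrt(math.pi)


# Changing the mean (60 vs. 120 ms) leaves the estimate essentially unchanged,
# whereas changing the standard deviation (20 vs. 40 ms) roughly doubles it.
for mean_ms, sd_ms in [(60.0, 20.0), (120.0, 20.0), (60.0, 40.0)]:
    print(mean_ms, sd_ms, round(mean_abs_difference(mean_ms, sd_ms), 1))
```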




Binaural  Pattern  Discrimination  and  Some  Implications  for  Possible  Neural  Mechanism 

The  results  from  Experiment  III  indicated  that  the  pattern  has  to  be  primarily  coded 
through  monaural  channels,  since  performance  in  the  monaural  conditions  was  superior  to 
the  binaural  conditions.  The  main  ideas  of  Sorkin’s  (1990)  pattern  correlation  model  were 
adapted  to  propose  a very  simple  model  here.  This  model  is  able  to  predict  much  of  the  data 
observed  in  this  experiment.  One  main  assumption  of  the  pattern  correlation  model  is  that 
listeners  are  able  to  discriminate  the  similarity  or  differences  in  patterns  by  estimating  the 
correlation  between  the  two  sets  of  intertone  times.  It  is  also  assumed  that  this  correlation 
is corrupted by internal noise. According to this model, when both sequences are presented
to one channel (e.g., the left ear), additive internal noise independently corrupts each
temporal component of the sequence. A correlation is then calculated and compared to some
arbitrary criterion, and a decision is made, either SAME or DIFFERENT. Clearly, even if the
trial was SAME and the stimulus correlation was 1.0, the listener may perform at less than
perfect detection because of the limiting internal noise. Adapting these ideas of the pattern
correlation  model  to  when  the  sequence  is  presented  to  both  ears  results  in  the  binaural 
correlation  model.  One  assumption  of  the  binaural  correlation  model  is  that  the  listener  is 
able  to  calculate  three  correlations,  one  for  the  left  ear,  one  for  the  right  ear,  and  one 
binaurally.  The  model  further  assumes  that  each  sequence  that  gives  rise  to  each  of  these 
three  correlations  is  independently  corrupted  by  noise  before  the  correlation  is  computed. 

Let $N_b$, $N_l$, and $N_r$ denote the noise that limits the calculation of correlations for the
binaural, left-ear, and right-ear channels, respectively. It is assumed that these noises are
Gaussian distributed with zero mean, with equal variance for the left and right ears,
$\sigma_l^2 = \sigma_r^2$, and with a binaural noise that is much larger than the monaural noise,
$\sigma_b^2 \gg \sigma_l^2$. Denote the Pearson product-moment correlation coefficients
associated with these cases as $r_b$, $r_l$, and $r_r$, respectively.
Computer simulations were used to evaluate the predictions of the model. The parameters
of the monaural noise were $N_l = N_r \sim \mathrm{normal}(0, 12^2)$, where the standard
deviation is in ms. This choice of standard deviation is very near the value of 15 ms used by
Sorkin (1990) and was selected after some pilot tests. The binaural noise was
$N_b \sim \mathrm{normal}(0, 50^2)$. For each condition, 10000 trials were simulated. Each
trial was either SAME or DIFFERENT with equal a priori probabilities. Eight intertone times
were selected on each trial with parameters equal to those of Experiment III, that is, with a
mean intertone time of 60 ms and a standard deviation of 20 ms. Each intertone time was then
corrupted with additive internal noise as defined above, and the correlation between the two
sequences was calculated. This correlation was then compared to a criterion, chosen at 0.55
to produce a high probability of correct response, and a decision was made.
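
The single-channel part of this simulation can be sketched in a few lines. This is a hedged reconstruction from the description above and not the original simulation code. The function names are hypothetical, and the second sequence on DIFFERENT trials is assumed to be an independent draw. It covers only the case in which both sequences reach one channel, with a single correlation compared against the criterion.

```python
import math
import random


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_single_channel(n_trials=10000, noise_sd=12.0, criterion=0.55, seed=0):
    """Percent correct for a same-different task limited by additive internal noise."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        same = rng.random() < 0.5
        first = [rng.gauss(60.0, 20.0) for _ in range(8)]
        # Assumption: DIFFERENT trials use an independently drawn second sequence.
        second = list(first) if same else [rng.gauss(60.0, 20.0) for _ in range(8)]
        # Each intertone time is corrupted independently by internal noise.
        noisy_first = [t + rng.gauss(0.0, noise_sd) for t in first]
        noisy_second = [t + rng.gauss(0.0, noise_sd) for t in second]
        respond_same = pearson_r(noisy_first, noisy_second) > criterion
        correct += (respond_same == same)
    return 100.0 * correct / n_trials
```

Calling `simulate_single_channel()` uses the monaural noise value of 12 ms, and calling it with `noise_sd=50.0` shows how a noise of the binaural magnitude pushes performance toward chance.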

According  to  the  binaural  correlation  model,  it  is  assumed  that  when  the  observer  is 
faced  with  more  than  one  correlation  value,  s/he  would  pick  the  largest  of  the  three  values 
and  compare  that  with  the  criterion.  This  means  that  the  decision  variable  D is 

(28)  $D = \mathrm{SAME} \iff \max(r_b, r_l, r_r) > C$.

Figure 32 replots the data from Experiment III together with the results from
simulations according to the above decision rule. There is excellent agreement between the
predictions of this model and the data. The random-mode binaural case produced poor
performance simply because the monaural-channel correlations are very small due to the
randomization of channels, and because when the sequences are combined, the correlations
are still poor due to the large internal binaural noise. It should be noted that the standard
deviation for the internal binaural noise was a parameter that was tested at several different
values. Values larger than 50 ms did not noticeably affect the output of the model, but
smaller values improved the model's prediction for the random-mode binaural case to a point
greater than that observed in the data. The 50-ms standard deviation, which does not seem
to be an unreasonable value, appears to be sufficient to predict poor performance in the
random-mode binaural case.

Figure 32. Average percent correct, for 4 listeners, is plotted as a function of the
experimental conditions. The solid line represents the obtained data, and the dashed
line represents the model's prediction.

Obviously  other  choices  of  the  decision  rule  may  have  also  been  possible.  Two  other 
cases,  for  example,  may  have  been  to  combine  the  information  from  the  three  correlations 
either 1) linearly, or 2) vectorially. These would give rise to a SAME response either when
1) $r_b + r_l + r_r > C$ or when 2) $\sqrt{r_b^2 + r_l^2 + r_r^2} > C$. While the decision rule selected here is
somewhat arbitrary, it is guided by two important observations. First, consider the control
condition  from  Experiment  II;  here  the  entire  sequence  was  presented  to  both  ears 
simultaneously.  That  is,  all  nine  tones  of  the  sequences  were  diotic.  The  individual  d's  for 
the  same-different  task  were  2.26,  2.93,  2.32,  and  1.87  for  the  four  subjects,  with  a mean 
value  of  2.34.  Compare  this  result  to  that  from  the  monaural  case  from  Experiment  III;  here 
the entire nine-tone sequences were presented to only the left ear. The d' values, calculated
from the percent-correct performance for the same-different task (Macmillan and Creelman,
1991), for the same four subjects were 3.10, 3.53, 2.79, and 2.28, with a mean d' of 2.93. It
is evident that subjects did not perform better using both ears (diotic binaural case) than one
ear (monaural case), even though the information from the monaural case was not only
present in the diotic case but doubled. Assuming that the noise that limits the
calculation of correlations is independent at the two ears, then if the correlations were
somehow summed or vectorially added, one would have expected better performance in the
binaural diotic case. A t-test between the distributions of d's for the diotic and monaural
conditions showed that the small difference in mean d's is not statistically significant
[t(6) = 1.69, p > 0.05].
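
The reported t statistic can be checked directly from the eight d' values listed above. The sketch below assumes an independent-samples t-test with pooled variance, which is the form consistent with the reported 6 degrees of freedom. The function name `pooled_t` is hypothetical.

```python
import math

# d' values for the four subjects, as listed in the text.
diotic = [2.26, 2.93, 2.32, 1.87]      # control condition of Experiment II
monaural = [3.10, 3.53, 2.79, 2.28]    # monaural condition of Experiment III


def pooled_t(x, y):
    """Independent-samples t statistic with pooled variance, and its degrees of freedom."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss_x = sum((v - mx) ** 2 for v in x)
    ss_y = sum((v - my) ** 2 for v in y)
    df = nx + ny - 2
    pooled_variance = (ss_x + ss_y) / df
    standard_error = math.sqrt(pooled_variance * (1.0 / nx + 1.0 / ny))
    return (my - mx) / standard_error, df


t_value, df = pooled_t(diotic, monaural)
print(f"t({df}) = {t_value:.2f}")   # prints: t(6) = 1.69

# Two-tailed critical value of t for 6 degrees of freedom at the 0.05 level.
T_CRITICAL_05 = 2.447
print("significant" if abs(t_value) > T_CRITICAL_05 else "not significant")
```

Because 1.69 is below the two-tailed critical value of 2.447, the difference between the two sets of d' values does not reach significance at the 0.05 level.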

Second, comparing the results from the alternating binaural and alternating monaural
conditions (right points of Figure 31) also showed a similar pattern of outcomes. The
alternating monaural condition produced a slightly higher percent-correct performance
compared to the alternating binaural condition. Again, this is not what one would predict
from combining the correlations. It is not quite clear why in both of these examples the
binaural cases produced a slightly lower level of performance. Interestingly, this small
difference is also seen in simulations and may be partly related to the possibility that when
the maximum of 3 correlations is considered, on the DIFFERENT trials this maximum
may surpass the criterion and lead to a SAME response.
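
The effect of taking a maximum on DIFFERENT trials can be illustrated with a simplified simulation. This sketch is not the binaural correlation model itself. It assumes, purely for illustration, that the three correlations are statistically independent and that each is computed on eight pairs of unrelated values, whereas in the model they share the same stimulus and the monaural channels may carry fewer intervals. The function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def false_alarm_rates(n_trials=10000, n_items=8, criterion=0.55, seed=0):
    """False-alarm rate on DIFFERENT trials for one correlation and for the max of three."""
    rng = random.Random(seed)
    single_hits = 0
    max_hits = 0
    for _ in range(n_trials):
        correlations = []
        for _ in range(3):
            # Unrelated sequences: the true correlation is zero on DIFFERENT trials.
            a = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
            b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
            correlations.append(pearson_r(a, b))
        single_hits += correlations[0] > criterion
        max_hits += max(correlations) > criterion
    return single_hits / n_trials, max_hits / n_trials
```

Under these assumptions the maximum of three correlations exceeds the criterion more often than any single correlation does, which is the direction of the effect suggested above.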

These results suggest that, as far as the discrimination of temporal patterns is
concerned, the sequence is primarily coded through monaural tracts. This finding was very
surprising, partly because one may have speculated that the coding of sequences of the
durations examined here (300-700 ms) is likely to involve much higher-order processes
and would therefore be expected to be somewhat channel independent. The results were also
surprising because it is usually presumed that the binaural system is efficient in maintaining
timing information. The bushy cells of the cochlear nucleus, which are considered important
at later stages for binaural processing, are able to preserve excellent timing information
before such information converges from the two ears at the medial superior olive (Pickles,
1991). Although binaural timing information is usually restricted to disparities of less than
one ms between the two ears, there was no reason a priori to suspect that long temporal
separations between the two ears could not also be efficiently coded.

It should also be noted that some previous work on the temporal coding of auditory
stimuli is in agreement with our finding on dichotic pattern discrimination. Deutsch
(1979), for example, studied the identification of melodic patterns presented either diotically
or dichotically to the two ears. Results showed that performance suffered when the
component tones of the melodic pattern were switched between the ears. Other studies have
demonstrated a loss in timing information when components are presented through different
frequency channels (same ear). Divenyi and Danner (1977), for example, suggested that
when timing information was processed across spectrally segregated channels, performance
was severely degraded. In addition, Divenyi and Sachs (1978) indicated that the
accuracy of timing discrimination decreases when brief time intervals that are bounded by
acoustic markers of different spectra occur in different input channels. Finally, Bergman and
Campbell (1978) have demonstrated that tasks involving within-stream stimuli are much
easier for the listener than tasks involving across-stream stimuli. These findings are
collectively in agreement with the results of our binaural experiments and suggest that
presenting information in a random order either to different ears or to different frequency
channels disrupts the efficient processing of temporal information.

Ideal  Distribution  of  Weights  in  a Pattern  Discrimination  Task 

In all the experiments described here, it was shown that subjects give some
components more weight than others. The intertone time at the first position, particularly,
received a substantial weight compared to other intertone times. In some of the experiments,
in addition to the first intertone time, other positions also received a large weight compared
to those given to the remaining positions; this latter case was particularly evident when the
standard deviation of the perturbation for one of the components was increased (Figures
23-26). The analysis of the distribution g shows that an increase in perturbation is analogous
to an effective increase in signal level, and therefore it was not surprising that such a
component would receive more weight. The question addressed in this section is whether
such a distribution of weights is an optimum strategy. Intuitively, it seems logical that a
component whose perturbation is increased, and whose signal level is thereby increased,
would be weighted more heavily; however, how much more heavily should it be weighted?
In the case where all components have equal means and equal variances, how should the
weights be distributed? Is giving the first component a greater weight than others a good
strategy when all intertone components have equal parameters?

We will distinguish between two cases. First, we will consider the case where all
components (intertone times) have equal means and equal standard deviations of
perturbations. Second, we will consider the case where the perturbation of some components
has a greater standard deviation than others. For the first case, ideal weights may be derived
according to Berg's (1990) analysis. The index of sensitivity, d', may be written for
intertone-time sequences as


(29)  $d' = \dfrac{\sum_{i=1}^{m} a_i E[g_i]}{\left( \sum_{i=1}^{m} a_i^2 \sigma_{int,i}^2 \right)^{1/2}}$

where $g_i = |t_{1i} - t_{2i}|$ is the absolute value of the difference distribution and
$\sigma_{int}$ in Eq. 29 is the standard deviation of internal noise. Note that the only factor
that limits the ability of listeners to detect SAME trials is, in fact, the internal noise
associated with the intertone times. If internal noise was zero, and because the two sequences
on the SAME trials are perfectly correlated, then subjects would perform at perfect detection.
Because all intertone times have equal means and perturbations in this case, $E[g]$ may be
factored out

(30)  $d' = \dfrac{E[g] \sum_{i=1}^{m} a_i}{\left( \sum_{i=1}^{m} a_i^2 \sigma_{int,i}^2 \right)^{1/2}}$

and since $\sum_{i=1}^{m} a_i = 1$, then

(31)  $d' \propto \dfrac{1}{\left( \sum_{i=1}^{m} a_i^2 \sigma_{int,i}^2 \right)^{1/2}}$



The distribution of weights would be ideal when the value of d' in Eq. 30 is maximum,
which occurs when the denominator is at its minimum. By taking the partial derivatives of
the denominator with respect to $a_i$ and $\sigma_{int}$, Berg (1990) has shown that the ideal
weight for component i is

(32)  $a_i = \dfrac{1}{\sigma_{int,i}^2 \sum_{j=1}^{m} \sigma_{int,j}^{-2}}$

If we make the reasonable assumption that the internal noise that limits the discriminability
of intertone times has the same standard deviation for all components, particularly when the
stimulus parameters are the same, then

(33)  $a_i = \dfrac{1}{m}$

where  m is  the  number  of  intertone  times.  Eq.  33  states  that  the  ideal  weight-distribution 
strategy  in  this  case  requires  that  all  components  receive  equal  weighting. 
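
The equal-weight result can be checked numerically. The following sketch assumes that d' has the weighted-sum form described above (a weighted sum of the expected absolute differences divided by the square root of the summed squared products of weight and internal-noise standard deviation) and that the ideal weights are inversely proportional to the internal-noise variances. The function names and the numerical values of the noise and of E[g] are hypothetical and serve only as an illustration.

```python
import math


def ideal_weights(noise_sds):
    """Weights inversely proportional to the internal-noise variance, normalized to sum to 1."""
    inverse_variances = [1.0 / (sd * sd) for sd in noise_sds]
    total = sum(inverse_variances)
    return [v / total for v in inverse_variances]


def d_prime(weights, expected_g, noise_sds):
    """Sensitivity for a weighted sum of components limited by independent internal noise."""
    numerator = sum(a * g for a, g in zip(weights, expected_g))
    denominator = math.sqrt(sum((a * sd) ** 2 for a, sd in zip(weights, noise_sds)))
    return numerator / denominator


M = 8                                          # number of intertone times
NOISE_SD = 12.0                                # hypothetical internal-noise SD (ms)
EXPECTED_G = 2.0 * 20.0 / math.sqrt(math.pi)   # hypothetical E[g] for a 20-ms SD

equal = ideal_weights([NOISE_SD] * M)          # every weight equals 1/M
first_heavy = [0.30] + [0.10] * (M - 1)        # extra weight on the first position

d_equal = d_prime(equal, [EXPECTED_G] * M, [NOISE_SD] * M)
d_first_heavy = d_prime(first_heavy, [EXPECTED_G] * M, [NOISE_SD] * M)
print(equal[0], d_equal > d_first_heavy)
```

With equal noise on every component, any departure from equal weighting, such as the extra weight on the first position used here, lowers the d' that can be obtained.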

Figure 7 (Experiment I) shows that listeners do not distribute weights equally but
seem to weight the first intertone time more heavily than other intertone times. Thus,
listeners are not making efficient use of the information provided in the intertone times in
this case. There are several possible reasons why such a non-optimum distribution of weights
is observed. It may simply be that the novelty of the beginning part of a sequence captures a
listener's attention more heavily than the ongoing part of the sequence. Alternatively, the
processing of the sequence may require some delay; the rapid occurrence of intertone times
may result in a build-up of cognitive load during the occurrence of the later intertone times,
resulting in less weight given to these later components. If we consider the Watson et al.
(1975) data, which, in contrast to the data of the current study, show that the latest-occurring
components receive the greatest attention, one may still conclude that the listeners are
distributing weights among components in a non-optimum way. In a few cases in the data of
the current study, there is evidence that the latest-occurring intertone time carries a slightly
larger weight. In these cases, the subject may be limited by memory effects, what Watson et
al. refer to as a "recency effect"; one may argue that listeners remember the most recent
events better and therefore weight them more heavily, even though this may not be the best
strategy to extract the maximum information from the stimulus.

For the second case of Experiment 2, the standard deviation of the perturbations was
not equal among all the components. Therefore, the derivation of the ideal weights is not
the same as in the case above. To derive the ideal weights for this case, we first need a
relationship between the standard deviation of the perturbations and d'. In Chapter 2, this
relationship was approximated as $d' = (0.03\,\sigma_{t_i})^{0.6}$. Next, we must determine
the relation between the ideal weight and d'. The d' for a single component may be written as
$d' = \Delta\mu/\sigma$, or equivalently, $d' = \Delta\mu/(\sigma^{2})^{1/2}$. Decision theory
requires that the proper weighting for multiple-component signals be inversely proportional to
the variances of the components (Van Trees, 1968). If, therefore, the d' for one component is
twice that of a second component, it should be weighted 4 times as much. That is, the ideal
weight $a_i$, before normalization to unity, should be proportional to the square of d',


(34)  $a_i \propto d_i'^{\,2}$

and, therefore, related to the standard deviation of the perturbations by

(35)  $a_i \propto \left[(0.03\,\sigma_{t_i})^{0.6}\right]^{2}$

For one such set of conditions (Figures 26 A-C, where the standard deviation of the
perturbations for the sixth component differs from that of the other components), the ideal
weights are plotted in Figures 33 A-C together with the weights actually measured for these
perturbations. As expected, the listeners give more weight to the intertone time with the
greater standard deviation. The change is in the direction predicted by the ideal
distribution of weights but is smaller in magnitude than the ideal.
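
As a minimal sketch of Eqs. 34 and 35, the following Python fragment computes normalized ideal weights from a set of perturbation standard deviations. The constants 0.03 and 0.6 come from the approximation cited from Chapter 2; the number of temporal positions (8), the baseline standard deviation (20 ms), the use of milliseconds as the unit, and the function names are assumptions made only for illustration and are not taken from the experiments.

```python
def dprime_from_sd(sd_ms, k=0.03, p=0.6):
    """Approximate single-component d' from the perturbation SD: d' = (k * sd)^p."""
    return (k * sd_ms) ** p


def ideal_weights(sds_ms):
    """Ideal weights proportional to d'^2 (Eq. 34), normalized to sum to one."""
    raw = [dprime_from_sd(sd) ** 2 for sd in sds_ms]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical sequence: 8 temporal positions with a 20-ms baseline SD (assumed),
# and the sixth position given a larger SD of 60 ms.
sds = [20.0] * 8
sds[5] = 60.0
weights = ideal_weights(sds)

for position, w in enumerate(weights, start=1):
    print(f"position {position}: ideal weight = {w:.3f}")
```

Under these assumptions the sixth position receives (60/20)^1.2, roughly 3.7 times, the ideal weight of each of the other positions.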

Efficiency  of  Weight  Distributions 

A comparison  of  the  listener's  performance  to  that  of  the  ideal  can  provide  a measure 
of  how  efficiently  listeners  process  the  sequence  information  derived  from  intertone  times. 
A listener's efficiency in a psychophysical task has been described by Tanner and Birdsall
(1958) as


(36)  $\eta = (d'_{obs} / d'_{ideal})^{2}$


Figure 33 A. The data connected by the solid line represent the average relative weights,
from all listeners, as a function of each temporal position. The dashed line represents
the ideal weights. The standard deviation for the sixth position was 40 ms. (Ordinate:
relative weight; abscissa: temporal position.)

Figure 33 B. The data connected by the solid line represent the average relative weights,
from all listeners, as a function of each temporal position. The dashed line represents
the ideal weights. The standard deviation for the sixth position was 60 ms. (Ordinate:
relative weight; abscissa: temporal position.)

Figure 33 C. The data connected by the solid line represent the average relative weights,
from all listeners, as a function of each temporal position. The dashed line represents
the ideal weights. The standard deviation for the sixth position was 100 ms. (Ordinate:
relative weight; abscissa: temporal position.)


This is a useful measure of performance when the d's for various conditions differ and are
therefore difficult to compare except against the ideal. The definition arises from the
fact that the efficiency of a system is usually defined as an energy ratio (Tanner and
Birdsall, 1958); because ideal d' is related to signal energy by $d' = \sqrt{2E/N_0}$,
efficiency is a function of squared d's. In the previous section, it was shown that subjects
do not usually distribute the weights according to the expected ideal. This was particularly
evident in Experiment 1, where the first intertone time carried more weight than the other
intertone times of the sequence. A similar outcome was observed for the first case of
Experiment 2, where the mean of the intertone time was varied; again the first component
carried more weight. In this section, efficiencies are calculated for the important
conditions in which the initial parts of a sequence carry more weight than other parts (the
first parts of Experiments 1 and 2). Because all the intertone times within the sequences
have equal perturbations ($\sigma_1 = \sigma_2 = \dots = \sigma_n$) for Experiment 1 and the
first case of Experiment 2, the ideal weights, as described in the previous section, can be
written as $a_{ideal} = 1/n$. Berg (1990) has shown that by substituting the definition of d'
(Eq. 29) into Eq. 36 and simplifying, one can derive an efficiency measure for the weight
distribution as

(37)  $\eta = \sum a_{ideal}^{2} / \sum a^{2}$
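
The weighting efficiency in Eq. 37 can be checked numerically. The sketch below assumes equal ideal weights (1/n), observed weights normalized to sum to one, and a simple decision model in which the listener forms a weighted sum of independent, equal-variance components that each carry the same mean difference. That model, the example weights, the choice of 8 positions, and the function names are illustrative assumptions rather than the procedure used in the experiments.

```python
import math


def weight_efficiency(weights):
    """Efficiency sum(a_ideal^2) / sum(a^2), assuming equal ideal weights 1/n."""
    total = sum(weights)
    a = [w / total for w in weights]          # normalize to sum to one
    n = len(a)
    ideal_sq = n * (1.0 / n) ** 2             # sum of squared ideal weights = 1/n
    return ideal_sq / sum(x * x for x in a)


def dprime_weighted_sum(weights, delta=1.0, sigma=1.0):
    """d' of a weighted sum of independent components (assumed decision model)."""
    mean_shift = delta * sum(weights)
    sd = sigma * math.sqrt(sum(w * w for w in weights))
    return mean_shift / sd


# Hypothetical listener: 8 positions, first intertone time weighted most heavily.
observed = [0.20] + [0.80 / 7] * 7
eta = weight_efficiency(observed)

# Cross-check against Eq. 36: eta = (d'_obs / d'_ideal)^2 under the assumed model.
d_obs = dprime_weighted_sum(observed)
d_ideal = dprime_weighted_sum([1.0 / 8] * 8)
print(f"eta from weights = {eta:.3f}, eta from d' ratio = {(d_obs / d_ideal) ** 2:.3f}")
```

In this example the first position carries 0.20 of the total weight instead of the ideal 0.125, yet the efficiency remains about 0.95, which illustrates why a moderately elevated first weight costs little in efficiency.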


Figure 34 represents the mean efficiencies of listeners' weighting distributions for the
first part of Experiment 1, where the pattern discrimination task was examined with two
different correlation values between sequences. The efficiency calculations were based on
the average weight distributions from the four listeners. It is evident that listeners are
able to distribute weights reasonably efficiently, and the increased weight given to the
first intertone time does not carry a great cost in efficiency.

Figure 34. Efficiency calculations from Experiment 1. The ordinate represents the
efficiency value and the abscissa represents the sequence correlation.

Figures 35 and 36 depict the mean efficiencies of listeners' weighting distributions for
the first case of Experiment 2. For this case, the intertone time at one of the temporal
positions (either the 2nd or the 6th position) had a mean that was different from that of the
other positions in the sequence. Figure 35 shows efficiencies for the case where the second
position had a different mean, and Figure 36 shows these efficiencies for the 6th position.
Again, in both cases, the listeners' efficiencies in distributing weights are very high,
with a slight tendency for the sixth-position case to produce a higher efficiency. It can
be argued that the high efficiencies seen here are another indication that the
proportion-of-the-total-pattern-duration (PTD) rule does not make any given component more
perceptually salient than others. Although one component may have had a large mean intertone
time, the efficiency did not drop substantially, indicating a reasonably equal distribution
of weights among components.


Figure 35. Efficiency calculations from part 1 of Experiment 2. The ordinate represents the
efficiency as a function of the value of the mean intertone time for the second
position.

Figure 36. Efficiency calculations from part 1 of Experiment 2. The ordinate represents the
efficiency as a function of the value of the mean intertone time for the sixth position.


Guidelines for Future Experiments

The current experiments examined some basic conditions in temporal sequence
discrimination. The primary focus was the estimation of the weights given to each component
of a sequence presented monaurally, diotically, or random-binaurally (dichotically),
including conditions in which the mean or standard deviation of the intertone times was
unequal among components. The reported results provide a basis for future experiments.
In particular, the results of the binaural experiments are surprising and are one
interesting area for further investigation.

One parameter of interest for the binaural paradigms is the total duration of the
sequence or, equivalently, the mean duration of the intertone times. As cited earlier, the
binaural system is considered to be good at maintaining timing information for events that
occur nearly simultaneously, within one ms. Though the question may become
phenomenologically different, one may examine binaural and monaural temporal-sequence
discrimination for very short intertone times. In this case, the appropriate stimuli may be
brief pulses (1 or 2 ms in duration), which are probably more suitable when very short
intervals are considered. For the longer sequences used in the current study of binaural
effects, one may also examine the effect of very large standard deviations for some
intertone times of the sequence. It is likely that in the random-binaural case, increasing
the standard deviation of one intertone time would improve performance, because such large
variability in one or more components may allow better discriminability, as expected
from the characteristics of the distributions described (e.g., g distribution).

As the properties of sequence coding at the basic levels described here become better
understood, one may consider temporal sequence coding in more complicated situations.
For example, one may consider temporal pattern discrimination in the presence of other
interfering stimuli. How would an observer's performance and weight-distribution strategy
be altered as a function of signal-to-noise ratio, that is, as the sound-pressure level of
the sequence is varied or the sequence is presented in a background of noise? We assume that
energetic masking will eventually degrade performance; however, its effect on the weight
distribution among the component intertone times is not clear. This is an interesting area
of study that would move our work closer to the acoustically complex environments typical
of natural listening conditions.

Finally, one may also examine temporal pattern discrimination in the presence of
informational masking. In a recent paper, Kidd et al. (1995) have shown that tonal
sequences of differing frequencies are more difficult to discriminate in the presence of
other sequences. Similarly, one may consider the coding and discriminability of intertone
sequences in the presence of other sequences that interfere, informationally, with the
signal sequence. The interfering sequence may overlap partially or entirely in time with
the signal sequence, it may have a different mean or variance of intertone times from that
of the signal sequence, it may be contained in a different critical band, or its
frequencies may be randomized. Such manipulations would help clarify the effects of
informational interference on the coding of intertone times in sequences.


REFERENCES 


Berg, B. G. (1989). Analysis of weights in multiple observation tasks. Journal of the
Acoustical Society of America, 86, 1743-1746.

Berg, B. G. (1990). Observer efficiency and weights in a multiple observation task. Journal
of the Acoustical Society of America, 88, 149-158.

Berg, B. G. & Green, D. M. (1990). Spectral weights in profile listening. Journal of the
Acoustical Society of America, 88, 758-766.

Campbell, F. W. & Robson, J. G. (1968). Application of Fourier analysis to the visibility
of gratings. Journal of Physiology (London), 197, 551-566.

Colavita, F. B. (1972). Auditory cortical lesions and visual pattern discrimination in cat.
Brain Research, 39, 437-447.

Colavita, F. B., Szeligo, F. V. & Zimmer, S. D. (1974). Temporal pattern discrimination in
cats with insular-temporal lesions. Brain Research, 79, 153-156.

Collins, L. M., Wakefield, G. H. & Feinman, G. R. (1994). Temporal pattern discrimination
and speech recognition under electrical stimulation. Journal of the Acoustical Society
of America, 96, 2731-2737.

Cornwell, P. (1970). Loss of auditory pattern discrimination following insular-temporal
lesion in cats. Journal of Comparative and Physiological Psychology, 63, 165-168.

Deutsch, D. (1979). Binaural integration of melodic patterns. Perception & Psychophysics,
25, 399-405.

DeValois, R. L. & DeValois, K. K. (1988). Spatial vision. New York: Oxford University
Press.

Dewson, J. H., Cowey, A. & Weiskrantz, L. (1970). Disruption of auditory sequence
discrimination by unilateral and bilateral cortical ablations of superior temporal
gyrus in the monkey. Experimental Neurology, 28, 529-548.


Diamond, I. T. & Neff, W. D. (1956). Ablation of temporal cortex and discrimination of
auditory patterns. Journal of Neurophysiology, 20, 300-315.

Dodwell, P. C. (1970). Visual pattern recognition. New York: Holt, Rinehart and Winston,
Inc.

Durlach, N. I. (1972). Binaural signal detection: Equalization and cancellation theory. In
J. V. Tobias (Ed.), Foundations of modern auditory theory, Volume II. New York:
Academic Press.

Espinoza-Varas, B. & Watson, C. S. (1986). Temporal discrimination for single components
of nonspeech auditory patterns. Journal of the Acoustical Society of America, 80,
1685-1694.

Fraisse, P. (1966). L'anticipation de stimulus rythmiques. Vitesse d'établissement et
précision de la synchronisation. L'Année Psychologique, 66, 15-36.

Fraisse, P. (1982). Rhythm and tempo. In D. Deutsch (Ed.), The psychology of music
(pp. 149-180). New York: Academic Press Inc.

Green, D. M. (1988). Profile analysis: Auditory intensity discrimination. New York,
Oxford: Oxford University Press.

Green, D. M., Kidd, G. & Picardi, M. C. (1983). Successive versus simultaneous
comparison in auditory intensity discrimination. Journal of the Acoustical Society
of America, 73, 639-643.

Green, D. M. & Mason, C. R. (1985). Auditory profile analysis: Frequency, phase, and
Weber's Law. Journal of the Acoustical Society of America, 77, 1155-1161.

Green, D. M. & Swets, J. A. (1966). Signal detection theory. New York: Wiley.

Halpern, A. R. & Darwin, C. J. (1982). Duration discrimination in a series of rhythmic
events. Perception & Psychophysics, 31, 86-89.

Hays, W. (1963). Statistics for psychologists. New York: Holt, Rinehart and Winston, Inc.

Hubel, D. H. & Wiesel, T. N. (1963). Shape and arrangement of columns in cat's striate
cortex. Journal of Physiology, 165, 559-568.

Jeffress, L. A. & Robinson, D. E. (1962). Formulas for the coefficient of interaural
correlation for noise. Journal of the Acoustical Society of America, 34, 1658-1659.


Jones, M. R., Kidd, G. R. & Wetzel, R. (1981). Evidence for rhythmic attention. Journal
of Experimental Psychology: Human Perception and Performance, 7, 1059-1073.

Karasseva, T. A. (1972). The role of the temporal lobe in human auditory perception.
Neuropsychologia, 10, 227-231.

Kidd, G., Mason, C. R., Uchanski, R. M., Brantley, M. A. & Shah, P. (1991). Evaluation
of simple models of auditory profile analysis using random reference spectra. Journal
of the Acoustical Society of America, 90, 1340-1354.

Kidd, G. R. & Watson, C. S. (1992). The "proportion-of-the-total-duration rule" for the
discrimination of auditory patterns. Journal of the Acoustical Society of America,
92, 3109-3118.

Kidd, G., Mason, C. R. & Rohtla, T. L. (1995). Binaural advantages for sound pattern
identification. Journal of the Acoustical Society of America, 98, 1977-1986.

Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and
perceptual evidence. Journal of the Acoustical Society of America, 59, 1208-1221.

Liberman, A. M., Cooper, F. S., Shankweiler, D. P. & Studdert-Kennedy, M. (1967).
Perception of speech code. Psychological Review, 74, 431-461.

Licklider, J. C. & Dzendolet, E. (1948). Oscillographic scatter plots illustrating various
degrees of correlation. Science, 107, 121-124.

Lindsay, P. H. & Norman, D. A. (1972). Human information processing: An introduction
to psychology. New York: Academic Press.

Luria, A. R. (1966). Higher cortical functions in man. New York: Basic Books.

Lutfi, R. A. (1992). Comments on "Analysis of weights in multiple observation tasks".
Journal of the Acoustical Society of America, 91, 507-508.

Lutfi, R. A. (1993). A model of auditory pattern analysis on component relative-entropy.
Journal of the Acoustical Society of America, 94, 748-758.

Lutfi, R. A. (1995). Correlation coefficients and correlation ratios as estimates of observer
weights in multiple-observation tasks. Journal of the Acoustical Society of America,
97, 1333-1334.

Macmillan, N. A. & Creelman, C. D. (1991). Detection theory: A user's guide.
Cambridge: Cambridge University Press.


Marr, D. (1982). Vision: A computational investigation into the human representation and
processing of visual information. New York: Freeman.

Martin, J. G. (1972). Rhythmic (hierarchical) versus serial structure in speech and other
behavior. Psychological Review, 79, 487-509.

Martin, J. G. & Sturges, P. T. (1974). Rhythmic structures in auditory temporal pattern
perception and immediate memory. Journal of Experimental Psychology, 102, 377-383.

Monahan, C. B. & Hirsh, I. J. (1990). Studies in auditory timing: 2. Rhythm patterns.
Perception & Psychophysics, 47, 227-242.

Montgomery, D. A. & Sorkin, R. D. (in press). Observer sensitivity to the differential
reliability of visual display elements. Human Factors.

Movshon, J. A., Adelson, E. H., Gizzi, M. S. & Newsome, W. T. (1985). The analysis of
moving visual patterns. In C. Chagas, R. Gattass, and C. Gross (Eds.), Pattern
recognition mechanisms (pp. 117-151). New York: Springer.

Moore, B. C. J. (1988). An introduction to the psychology of hearing. New York:
Academic Press.

Murch, G. M. (1973). Visual and auditory perception. New York: The Bobbs-Merrill
Company, Inc.

Pickles, J. O. (1991). An introduction to the physiology of hearing. New York: Academic
Press.

Pitman, J. (1993). Probability. New York: Springer-Verlag.

Richards, V. M. & Zhu, S. (1994). Relative estimates of combination weights, decision
criteria, and internal noise based on correlation coefficients. Journal of the
Acoustical Society of America, 95, 424-434.

Robin, D. A., Royer, F. L. & Abbas, P. J. (1987). The perception of repetitive auditory
temporal patterns. In W. A. Yost and C. S. Watson (Eds.), Auditory processing of
complex sounds (pp. 87-103). New Jersey: Lawrence Erlbaum Associates,
Publishers.

Sadralodabai, T., Sorkin, R. D. & Montgomery, D. A. (1993). Serial position effects in
temporal pattern discrimination. Journal of the Acoustical Society of America, 93,
2385-2386 (abstract).


Siebert, W. M. (1968). Stimulus transformations in the peripheral auditory system. In P. A.
Kolers and M. Eden (Eds.), Recognizing patterns. Cambridge: MIT Press.

Sorkin, R. D. (1990). Perception of temporal patterns defined by tonal sequences. Journal
of the Acoustical Society of America, 87, 1695-1701.

Sorkin, R. D. & Montgomery, D. A. (1991). Effect of time compression and expansion on
the discrimination of tonal patterns. Journal of the Acoustical Society of America,
90, 846-857.

Steedman, M. J. (1977). The perception of musical rhythm and meter. Perception, 6,
555-569.

Stevens, K. N. & House, A. S. (1972). Speech perception. In J. V. Tobias (Ed.),
Foundations of modern auditory theory, Volume II. New York: Academic Press.

Tanner, W. P. & Birdsall, T. G. (1958). Definitions of d' and η (eta) as psychophysical
measures. Journal of the Acoustical Society of America, 30, 922-928.

Van Trees, H. L. (1968). Detection, estimation, and modulation theory, Part I. New York:
Wiley.

Vos, J. & Rasch, R. (1981). The perceptual onset of musical tones. Perception &
Psychophysics, 29, 323-335.

Watson, C. S., Kelly, W. J. & Wroton, H. W. (1976). Factors in the discrimination of tonal
patterns. II. Selective attention and learning under various levels of stimulus
uncertainty. Journal of the Acoustical Society of America, 60, 1176-1186.

Watson, C. S., Wroton, H. W., Kelly, W. J. & Benbassat, C. A. (1975). Factors in the
discrimination of tonal patterns. I. Component frequency, temporal position, and
silent intervals. Journal of the Acoustical Society of America, 57, 1175-1185.

Zera, J., Onsan, Z. A. & Nguyen, Q. T. (1993). Auditory profile analysis of harmonic
signals. Journal of the Acoustical Society of America, 93, 3431-3441.


BIOGRAPHICAL  SKETCH 


Toktam Sadralodabai was born August 18, 1963, in Mashhad, Iran. She received her
B.A. degree in psychology in 1991 from California State University, Los Angeles, and her
Ph.D. degree from the University of Florida. Toktam's research interests are best
characterized as the study of human auditory performance, perception, and psychophysics.


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality, 
as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Robert  D.  Sorkin,  Chair 
Professor  of  Psychology 


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality, 
as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


David  M.  Green 
Graduate  Research 
Professor  of  Psychology 


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality, 
as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 


Associate  Professor  of  Psychology 


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality, 
as a dissertation for the degree of Doctor of Philosophy.


David  R.  Perrott 
Professor  of  Psychology 
California  State  University 
Los  Angeles 


I certify  that  I have  read  this  study  and  that  in  my  opinion  it  conforms  to 
acceptable  standards  of  scholarly  presentation  and  is  fully  adequate,  in  scope  and  quality, 
as  a dissertation  for  the  degree  of  Doctor  of  Philosophy. 




John  C.  Middlebrooks 
Associate Professor of Neuroscience


This  dissertation  was  submitted  to  the  Graduate  Faculty  of  the  Department  of 
Psychology  in  the  College  of  Liberal  Arts  and  Sciences  and  to  the  Graduate  School  and 
was accepted as partial fulfillment of the requirements for the degree of Doctor of
Philosophy. 


May,  1996 


Dean,  Graduate  School 


